Introduction


Whole brain emulation (WBE), the possible future one-to-one modelling of the function of the 
human brain, is academically interesting and important for several reasons: 


• Research

    o Brain emulation is the logical endpoint of computational neuroscience's
      attempts to accurately model neurons and brain systems.

    o Brain emulation would help us to understand the brain, both in the lead-up
      to successful emulation and afterwards by providing an ideal test bed for
      neuroscientific experimentation and study.

    o Neuromorphic engineering based on partial results would be useful in a
      number of applications such as pattern recognition, AI and brain-computer
      interfaces.

    o As a long-term research goal it might be a strong vision to stimulate
      computational neuroscience.

    o As a case in futures studies, it is an example of how a radical future
      possibility can be examined in the light of current knowledge.

• Economics

    o The economic impact of copyable brains could be immense, and could have
      profound societal consequences (Hanson, 1994, 2008b). Even low probability
      events of such magnitude merit investigation.

• Individually

    o If emulation of particular brains is possible and affordable, and if concerns
      about individual identity can be met, such emulation would enable back-up
      copies and "digital immortality".

• Philosophy

    o Brain emulation would itself be a test of many ideas in the philosophy of
      mind and philosophy of identity, or provide a novel context for thinking
      about such ideas.

    o It may represent a radical new form of human enhancement.


WBE represents a formidable engineering and research problem, yet one which appears to 
have a well-defined goal and could, it would seem, be achieved by extrapolations of current 
technology. This is unlike many other suggested radically transformative technologies like 
artificial intelligence where we do not have any clear metric of how far we are from success. 


In order to develop ideas about the feasibility of WBE, ground technology foresight and 
stimulate interdisciplinary exchange, the Future of Humanity Institute hosted a workshop on 
May 26 and 27, 2007, in Oxford. Invited experts from areas such as computational 
neuroscience, brain-scanning technology, computing, nanotechnology, and neurobiology 
presented their findings and discussed the possibilities, problems and milestones that would 
have to be reached before WBE becomes feasible. 


The workshop avoided dealing with socioeconomic ramifications and with philosophical 
issues such as theory of mind, identity or ethics. While important, such discussions would 
undoubtedly benefit from a more comprehensive understanding of the brain—and it was this 
understanding that we wished to focus on furthering during the workshop. Such issues will 
likely be dealt with at future workshops. 


This document combines an earlier whitepaper that was circulated among workshop 
participants, and additions suggested by those participants before, during and after the 
workshop. It aims at providing a preliminary roadmap for WBE, sketching out key 
technologies that would need to be developed or refined, and identifying key problems or 
uncertainties. 


Brain emulation is currently only a theoretical technology. This makes it vulnerable to 
speculation, "handwaving" and untestable claims. As proposed by Nick Szabo, "falsifiable
design" is a way of curbing the problems with theoretical technology:


...the designers of a theoretical technology in any but the most predictable of areas 
should identify its assumptions and claims that have not already been tested in a 
laboratory. They should design not only the technology but also a map of the 
uncertainties and edge cases in the design and a series of such experiments and tests 
that would progressively reduce these uncertainties. A proposal that lacks this 
admission of uncertainties coupled with designs of experiments that will reduce such 
uncertainties should not be deemed credible for the purposes of any important 
decision. We might call this requirement a requirement for a falsifiable design. (Szabo, 
2007) 


In the case of brain emulation this would mean not only sketching how a brain emulator 
would work if it could be built and a roadmap of technologies needed to implement it, but 
also a list of the main uncertainties in how it would function and proposed experiments to 
reduce these uncertainties. 


It is important to emphasize the long-term and speculative nature of many aspects of this
roadmap, which in any case is to be regarded only as a first draft, to be updated, refined,
and corrected as better information becomes available. Given the difficulties and 
uncertainties inherent in this type of work, one may ask whether our study is not premature. 
Our view is that when the stakes are potentially extremely high, it is important to apply the 
best available methods to try to understand the issue. Even if these methods are relatively 
weak, it is the best we can do. The alternative would be to turn a blind eye to what could turn 
out to be a pivotal development. Without first studying the question, how is one to form any 
well-grounded view one way or the other as to the feasibility and proximity of a prospect like 
WBE? 


Thanks to 


We would like to warmly thank the many people who have commented on the paper and 
helped extend and improve it: 


Workshop participants: John Fiala, Robin Hanson, Kenneth Jeffrey Hayworth, Todd 
Huffman, Eugene Leitl, Bruce McCormick, Ralph Merkle, Toby Ord, Peter Passaro, Nick 
Shackel, Randall A. Koene, Robert A. Freitas Jr and Rebecca Roache. 


Other useful comments: Stuart Armstrong. 


The concept of brain emulation 


Whole brain emulation, often informally called “uploading” or “downloading”, has been the 
subject of much science fiction and also some preliminary studies (see Appendix D for history 
and previous work). The basic idea is to take a particular brain, scan its structure in detail, 
and construct a software model of it that is so faithful to the original that, when run on 
appropriate hardware, it will behave in essentially the same way as the original brain. 


Emulation and simulation 


The term emulation originates in computer science, where it denotes mimicking the function 
of a program or computer hardware by having its low-level functions simulated by another 
program. While a simulation mimics the outward results, an emulation mimics the internal 
causal dynamics (at some suitable level of description). The emulation is regarded as 
successful if the emulated system produces the same outward behaviour and results as the 
original (possibly with a speed difference). This is somewhat softer than a strict mathematical 
definition¹.


According to the Church-Turing thesis, a Turing machine can emulate any other Turing 
machine. The physical Church-Turing thesis claims that every physically computable 
function can be computed by a Turing machine. This is the basis for brain emulation: if brain 
activity is regarded as a function that is physically computed by brains, then it should be 
possible to compute it on a Turing machine. Even if true, however, it does not demonstrate 
that it is a computationally feasible process. 


In the following, emulation will refer to a 1-to-1 model where all relevant properties of a 
system exist, while a simulation will denote a model where only some properties exist. 
Emulations may behave differently from each other or the original due to noise or intrinsic 
chaos, but behave within the range of what one would expect from the original if it had 
experienced the same noise or chaos. 


By analogy with a software emulator, we can say that a brain emulator is software (and 
possibly dedicated non-brain hardware) that models the states and functional dynamics of a 
brain at a relatively fine-grained level of detail. 


In particular, a mind emulation is a brain emulator that is detailed and correct enough to 
produce the phenomenological effects of a mind. 


¹ A strict definition of simulation might be that a system S consists of a state $x(t)$ evolving by a particular dynamics $f$,
influenced by inputs $I$ and producing outputs $O$: $x(t+1) = f(I, x(t))$, $O(t) = g(x(t))$. Another system T simulates S if it
produces the same output (within a tolerance) for the same input time series starting with a given state (within a tolerance):
$X(t+1) = F(I, X(t))$, $O(t) = G(X(t))$, where $|x(t) - X(t)| < \epsilon_1$ and $X(0) = x(0) + \epsilon_2$. The simulation is an
emulation if $F = f$ (up to a bijective transformation of $X(t)$), that is, the internal dynamics is identical and similar
outputs are not due to the form of $G(X(t))$.


Chaotic systems are not simulable by this definition, since after enough time they will diverge if the initial conditions 
differ. Since even a three neuron system can become chaotic (Li, Yu et al., 2001) it is very plausible that the brain 
contains chaotic dynamics and it is not strictly simulable. However, there exists a significant amount of noise in the 
brain that does not prevent meaningful brain states from evolving despite the indeterminacy of their dynamics. A 
"softer" form of emulation may be possible to define that has a model or parameter error smaller than the noise level 
and is hence practically indistinguishable from a possible evolution of the original system. 
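
The divergence argument in this footnote can be illustrated with a minimal sketch. It assumes the logistic map as a stand-in dynamics $f$ (it is not a neural model and does not appear in the source), and the function names and parameter values are illustrative only. The same update rule is run from two initial states that differ by a small $\epsilon_2$, and the largest gap $|x(t) - X(t)|$ is recorded.

```python
def logistic(r):
    """Return the update rule x -> r * x * (1 - x) for a given parameter r."""
    return lambda x: r * x * (1.0 - x)


def max_divergence(f, x0, eps, steps):
    """Run the same dynamics f from x0 and from x0 + eps.

    Returns the largest |x(t) - X(t)| seen over the run, i.e. the quantity
    that the footnote requires to stay below a tolerance.
    """
    x, X = x0, x0 + eps
    worst = abs(x - X)
    for _ in range(steps):
        x, X = f(x), f(X)
        worst = max(worst, abs(x - X))
    return worst


# Non-chaotic regime (r = 2.5): both runs settle on the same fixed point.
stable_gap = max_divergence(logistic(2.5), x0=0.3, eps=1e-9, steps=100)

# Chaotic regime (r = 4.0): the tiny initial difference is amplified.
chaotic_gap = max_divergence(logistic(4.0), x0=0.3, eps=1e-9, steps=100)

print(f"stable gap:  {stable_gap:.3e}")
print(f"chaotic gap: {chaotic_gap:.3e}")
```

In the non-chaotic case the gap never grows beyond the initial perturbation, whereas in the chaotic case it reaches the size of the state space itself, which is why a strict tolerance-based definition fails for chaotic systems.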


A person emulation is a mind emulation that emulates a particular mind. 


What the "relevant properties" are is a crucial issue. In terms of software emulation this is 
often the bits stored in memory and how they are processed. A computer emulator may 
emulate the processor, memory, I/O and so on of the original computer, but does not simulate 
the actual electronic workings of the components, only their qualitative function on the stored 
information (and its interaction with the outside world). While lower-level emulation of 
computers may be possible it would be inefficient and not contribute much to the functions 
that interest us. 


Depending on the desired success criterion emulation may require different levels of detail. It 
might also use different levels of detail in different parts of the system. In the computer 
example, emulating the result of a mathematical calculation may not require simulating the 
execution of all operating system calls for math functions (since these can be done more 
efficiently by the emulating computer's processor) while emulating the behaviour of an 
analogue video effect may require a detailed electronics simulation. 
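
As a rough illustration of what emulating a computer at the level of stored information means, the sketch below emulates a made-up accumulator machine. The instruction set (`LOAD`, `STORE`, `ADD`, `SUB`, `JNZ`, `HALT`) and the program are hypothetical and chosen only for this example. The emulator reproduces how instructions transform memory and says nothing about the electronics that would implement them.

```python
def run(program, memory, max_steps=10_000):
    """Emulate a toy accumulator machine at the instruction level.

    The emulated state is just the accumulator, the program counter and the
    memory cells; nothing about voltages or gates is modelled.
    """
    acc, pc = 0, 0
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "LOAD":
            acc = memory[args[0]]
        elif op == "STORE":
            memory[args[0]] = acc
        elif op == "ADD":
            acc += memory[args[0]]
        elif op == "SUB":
            acc -= memory[args[0]]
        elif op == "JNZ":  # jump if the accumulator is non-zero
            if acc != 0:
                pc = args[0]
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit reached without HALT")


# Hypothetical program: add n + (n-1) + ... + 1 into the cell "sum".
PROGRAM = [
    ("LOAD", "sum"),
    ("ADD", "n"),
    ("STORE", "sum"),
    ("LOAD", "n"),
    ("SUB", "one"),
    ("STORE", "n"),
    ("JNZ", 0),
    ("HALT",),
]

result = run(PROGRAM, {"n": 5, "sum": 0, "one": 1})
print(result["sum"])  # 15
```

The host's own arithmetic performs each `ADD` directly, which mirrors the point above that parts of an emulation can be delegated to the emulating computer when only the result matters.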


Little need for whole-system understanding 


An important hypothesis for WBE is that in order to emulate the brain we do not need to 
understand the whole system, but rather we just need a database containing all necessary 
low-level information about the brain and knowledge of the local update rules that change 
brain states from moment to moment. A functional understanding (why is a particular piece 
of cortex organized in a certain way) is logically separate from detail knowledge (how is it 
organised, and how does this structure respond to signals). Functional understanding may be 
a possible result from detail knowledge and it may help gather only the relevant information
for WBE, but it is entirely possible that we could acquire full knowledge of the component 
parts and interactions of the brain without gaining an insight into how these produce (say) 
consciousness or intelligence. 
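
The idea of a database of low-level information plus local update rules can be sketched as follows. This is a toy under stated assumptions, not a proposed emulation scheme: the two-neuron "database", the leaky integrate-and-fire style rule and all parameter values are hypothetical. The update function looks only at each neuron's own record and its incoming connections, and contains no description of what the network is for.

```python
# "Database": per-neuron parameters and state, plus a list of connections.
neurons = {
    "a": {"v": 0.0, "threshold": 1.0, "leak": 0.9, "drive": 0.5},
    "b": {"v": 0.0, "threshold": 1.0, "leak": 0.9, "drive": 0.0},
}
synapses = [("a", "b", 0.6)]  # (presynaptic, postsynaptic, weight)


def step(neurons, synapses, spiked_last):
    """Apply the local update rule once to every neuron.

    Each neuron only sees its own state, its constant drive and the spikes
    that arrived from its presynaptic partners on the previous step.
    """
    spiked_now = set()
    for name, n in neurons.items():
        incoming = sum(w for pre, post, w in synapses
                       if post == name and pre in spiked_last)
        n["v"] = n["leak"] * n["v"] + n["drive"] + incoming
        if n["v"] >= n["threshold"]:
            spiked_now.add(name)
            n["v"] = 0.0  # reset after a spike
    return spiked_now


def simulate(neurons, synapses, steps):
    """Return, for each neuron, the list of time steps at which it spiked."""
    trains = {name: [] for name in neurons}
    spiked = set()
    for t in range(steps):
        spiked = step(neurons, synapses, spiked)
        for name in spiked:
            trains[name].append(t)
    return trains


trains = simulate(neurons, synapses, steps=13)
print(trains)
```

With these values neuron `a` fires on every third step and neuron `b` fires after every second spike of `a`. That regularity is nowhere written into the code; it follows from the stored parameters and the local rule.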


Even a database merely containing the complete "parts list" of the brain, including the 
morphology of its neurons, the locations, sizes and types of synaptic connections, would be 
immensely useful for research. It would enable data-driven research in the same way as 
genomics has done in the field of cell biology (Fiala, 2002). 


[Figure 1: Understanding of function vs. understanding of details. Vertical axis: understanding of function (low to high). Horizontal axis: understanding of details (low to high). Plot labels: Total, Qualitative models, Quantitative models, Emulation.]


Computational neuroscience attempts to understand the brain by making mathematical or
software models of neural systems. Currently, the models are usually far simpler than the
studied systems, with the exception of some small neural networks such as the lobster
stomatogastric ganglion (Nusbaum and Beenhakker, 2002) and the locomotor network of the
lamprey spinal cord (Kozlov, Lansner et al., 2007). Often models involve a combination of
simplified parts (simulated neurons and synaptic learning rules) and network structures
(subsampling of biological neurons, simple topologies). Such networks can themselves
constitute learning or pattern recognizing systems on their own, artificial neural networks (ANNs).
ANN models can be used to qualitatively model, explain and analyze the functions of brain 
systems (Rumelhart, McClelland et al., 1986). Connectionist models build more complex 
models of cognition or brain function on these simpler parts. The end point of this pursuit 
would be models that encompass a full understanding of the function of all brain systems. 
Such qualitative models might not exhibit intelligence or the complexity of human behaviour 
but would enable a formalized understanding of how they come about from simple parts. 


Another approach in computational neuroscience involves creating more biologically realistic 
models, where information about the biological details of neurons such as their 
electrochemistry, biochemistry, detailed morphology and connectivity are included. At its 
simplest we find compartment models of individual neurons and synapses, while more 
complex models include multiple realistic neurons connected into networks, possibly taking 
interactions such as chemical volume transmission into account. This approach can be seen as 
a quantitative understanding of the brain, aiming for a complete list of the biological parts 
(chemical species, neuron morphologies, receptor types and distribution etc.) and modelling 
as accurately as possible the way in which these parts interact. Given this information 
increasingly large and complex simulations of neural systems can be created. WBE represents 
the logical conclusion of this kind of quantitative model: a 1-to-1 model of brain function. 


Note that the amount of functional understanding needed to achieve a 1-to-1 model is small. 
Its behaviour is emergent from the low-level properties, and may or may not be understood 
by the experimenters. For example, if coherent oscillations are important for conceptual 
binding and these emerge from the low-level properties of neurons and their networks, a 
correct and complete simulation of these properties will produce the coherence. 
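
The claim that coherence can emerge from low-level rules without being programmed in can be illustrated with a minimal sketch. It assumes the mean-field Kuramoto model of coupled phase oscillators as a stand-in. That model is far simpler than any neuron model and is not taken from the source, and all parameter values are illustrative. Each unit only follows its own frequency plus a pull towards the population's mean phase.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (r, psi): coherence r in [0, 1] and mean phase psi."""
    mean = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean), cmath.phase(mean)


def simulate(coupling, n=50, steps=2000, dt=0.05, seed=0):
    """Euler-integrate n globally coupled phase oscillators.

    Each oscillator follows
        d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i),
    the mean-field form of the Kuramoto model. Returns the final coherence r.
    """
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    freqs = [rng.gauss(1.0, 0.1) for _ in range(n)]  # slightly different units
    for _ in range(steps):
        r, psi = order_parameter(phases)
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, freqs)]
    return order_parameter(phases)[0]


r_uncoupled = simulate(coupling=0.0)
r_coupled = simulate(coupling=2.0)
print(f"coherence without coupling: {r_uncoupled:.2f}")
print(f"coherence with coupling:    {r_coupled:.2f}")
```

Nothing in the update rule mentions synchrony, yet the coupled population ends up nearly phase-locked while the uncoupled one stays incoherent. In this sense a correct simulation of the parts is enough to reproduce the collective behaviour.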


In practice computational neuroscience works in between quantitative and qualitative 
models. Qualitative models are used to abstract complex, uncertain and potentially irrelevant 
biological data, and often provide significant improvements in simulation processing 
demands (in turn enabling larger simulations, which may enable exploration of domains of 
more interest). Quantitative models are more constrained by known biology, chemistry and 
physics but often suffer from an abundance of free parameters that have to be set (Herz, 
Gollisch et al., 2006). Hybrid models may include parts using different levels of abstraction, 
or exist as a family of models representing the same system at different levels of abstraction. 


The interplay between biological realism (attempting to be faithful to biology), completeness 
(using all available empirical data about the system), tractability (the possibility of 
quantitative or qualitative simulation) and understanding (producing a compressed 
representation of the salient aspects of the system in the mind of the experimenter) will often 
determine what kind of model is used. The appropriate level of abstraction and method of 
implementation depends on the particular goal of the model. In the case of WBE, the success 
criteria discussed below place little emphasis on understanding, but much emphasis on 
qualitatively correct dynamics, requiring much biological realism (up to a point, set by scale 
separation) and the need for data-driven models. Whether such models for whole brain 
systems are tractable from a modelling and simulation standpoint is the crucial issue. 


Brain emulation cannot be achieved without some functional understanding. It needs models
and theories for recognizing what data is relevant, and would provide data for developing
and testing these further. While in theory brain emulation might hug the lower line in Figure
1, in practice it will likely occur somewhere along the right edge: still far below a full
understanding of the top level phenomena, but including a broad understanding of many
kinds of low level phenomena. We also need some understanding of higher level phenomena
to test our simulations and know what kind of data we need to pursue. Fostering the right
research cycle for developing the right understanding, collecting data, improving
instrumentation, and experimenting with limited emulations (in addition to providing
useful services to related fields and beneficial spin-offs) would be indispensable for the
development of WBE.


Levels of emulation and success criteria


For the brain, several levels of success criteria for emulation can be used. They form a
hierarchy extending from low-level targets to complete emulation. See Table 1.


Not shown in this hierarchy are emulations of subsystems or small volumes of the brain,
"partial brain emulations".


Properly speaking, a complete scan, parts list and brain database (1a, 1b and 2) do not
constitute successful brain emulation, but such achievements (and partial brain emulations)
would in any case be important milestones and useful in themselves.


Similarly, the high-level achievements related to social roles, mental states, and personal
identity (6a, 6b and 6c) are both poorly understood and hard to operationalize, but given the
philosophical interest in WBE we have included them here for completeness. It is not obvious
how these criteria relate to one another, or to what extent they might be entailed by the
criteria for 4 and 5.


[Figure 2: Success levels for WBE. Boxes, from bottom to top: 1a Parts list; 1b Complete scan; 2 Brain database; 3 Functional brain emulation; 4 Species generic brain emulation; 5 Individual brain emulation; 6a Social role-fit emulation; 6b Mind emulation; 6c Personal identity emulation.]


Achieving the third success criterion beyond a certain resolution would, assuming some
supervenience thesis, imply success of some or all of the other criteria. A full quantum-mechanical
N-body or field simulation encompassing every particle within a brain would
plausibly suffice even if "quantum mind" theories are correct. At the very least a 1-to-1
material copy of the brain (a somewhat inflexible and very particular kind of emulating
computer) appears to achieve all criteria, possibly excepting those for 6c. However, this is
likely an excessively detailed level since the particular phenomena we are interested in (brain
function, psychology, mind) appear to be linked to more macroscopic phenomena than
detailed atomic activity.


Given the complexities and conceptual issues of consciousness we will not examine criteria 
6abc, but mainly examine achieving criteria 1-5. 


Table 1: Success Criteria


Level 1a: "Parts list"
    Success criterion: An inventory of all objects on a particular size scale, their
    properties and interactions.
    Relevant properties: Low level neural structure, chemistry, dynamics accurate to
    resolution level.

Level 1b: "Complete scan"
    Success criterion: A complete 3D scan of a brain at high resolution.
    Relevant properties: Resolution, information enabling structure to function mapping.

Level 2: "Brain database"
    Success criterion: Combining the scan and parts list into a database mapping the low
    level objects of a brain.
    Relevant properties: 1-to-1 mapping of scan to simulation/emulation objects.

Level 3: "Functional brain emulation"
    Success criterion: The emulation simulates the objects in a brain database with enough
    accuracy to produce (at least) a substantial range of species-typical basic emergent
    activity of the same kind as a brain (e.g. a slow wave sleep state or an awake state).
    Relevant properties: Generically correct causal micro-dynamics.

Level 4: "Species generic brain emulation"
    Success criterion: The emulation produces the full range of (human) species-typical
    emergent behavior and learning capacity.
    Relevant properties: Long term dynamics and adaptation. Appropriate behaviour
    responses. Full-range learning capacity.

Level 5: "Individual brain emulation"
    Success criterion: The emulation produces emergent activity characteristic of that of
    one particular (fully functioning) brain. It is more similar to the activity of the
    original brain than any other brain.
    Relevant properties: Correct internal and behaviour responses. Retains most memories
    and skills of the particular brain that was emulated. (In an emulation of an animal
    brain, it should be possible to recognize the particular (familiar) animal.)

Level 6a: "Social role-fit emulation"/"Person emulation"
    Success criterion: The emulation is able to fill and be accepted into some particular
    social role, for example to perform all the tasks required for some normally human job.
    (Socio-economic criteria involved)
    Relevant properties: Properties depend on which (range of) social roles the emulation
    would be able to fit. In a limiting case, the emulation would be able to pass a
    personalized Turing test: outsiders familiar with the emulated person would be unable
    to detect whether responses came from original person or emulation.

Level 6b: "Mind emulation"
    Success criterion: The emulation produces subjective mental states (qualia, phenomenal
    experience) of the same kind that would have been produced by the particular brain
    being emulated. (Philosophical criteria involved)
    Relevant properties: The emulation is truly conscious in the same way as a normal
    human being.

Level 6c: "Personal identity emulation"
    Success criterion: The emulation is correctly described as a continuation of the
    original mind; either as numerically the same person, or as a surviving continuer
    thereof. (Philosophical criteria involved)
    Relevant properties: The emulation is an object of prudentially rational self-concern
    for the brain to be emulated.
Scale separation


At first it may appear unlikely that a complex system with many degrees of freedom like the 
brain could be modelled with the right causal dynamics, but without taking into account the 
smallest parts. Microstimulation of individual neurons can influence sensory decisions 
(Houweling and Brecht, 2008), showing that very small disturbances can - under the right 
circumstances - scale up to behavioural divergences. 


However, state variables of complex systems can be quantitatively predicted when there is 
'scale separation': when different aspects of the system exist on sufficiently (orders of 
magnitude) different scales (of size, energy, time etc), they can become uncoupled 
(Hillerbrand, 2008). A typical example is how the microscopic dynamics of a laser (atoms 
interacting with an oscillating electromagnetic field) gives rise to a macroscopic dynamics 
(the growth and decay of different laser modes) in such a way that an accurate simulation of 
the system using only elements on the macroscale is possible. Another example is the scale 
separation between electric currents and logic operations in a computer, which enables bit- 
based emulation. When there is no scale separation (such as in fluid turbulence) macroscale 
predictions become impossible without simulating the entire microscale. 


An important issue to be determined is whether such a cut-off exists in the case of the human 
brain and, if it does exist, at what level. While this paper phrases it in terms of 
simulation/emulation, it is encountered in a range of fields (AI, cognitive neuroscience, 
philosophy of mind) in other forms: what level of organisation is necessary for intelligent, 
personal, or conscious behaviour? 


A key assumption of WBE is that, at some intermediary level of simulation resolution 
between the atomic and the macroscopic, there exists at least one cut-off such that meeting 
criteria 1a and 1b at this level of resolution also enables the higher criteria to be met. 


At such a spatial, temporal, or organisational scale, the dynamics on the larger/slower scale is 
not functionally sensitive to the dynamics of the smaller/faster scale. Such scale separation 
might occur at the synaptic scale, where the detailed chemical dynamics underlying synaptic 
function could be replaced by a simplified qualitative model of its effects on signals and 
synaptic strengths. Another possible scale separation level might occur between individual 
molecules and molecular concentration scales: molecular dynamics could be replaced with 
mass-action interactions of concentrations. A perhaps less likely separation could also occur
on higher levels if what matters is the activity of cortical minicolumns rather than individual
neurons. A final likely but computationally demanding scale of separation would be the
atomic scale, treating the brain emulation as an N-body system of atoms.


Conversely, if it could be demonstrated that there is no such scale, it would demonstrate the
infeasibility of whole brain emulation. Due to causally important influence from smaller
scales in this case, a simulation at a particular scale cannot become an emulation. The causal
dynamics of the simulation is not internally constrained, so it is not a 1-to-1 model of the
relevant dynamics. Biologically interesting simulations might still be possible, but they
would be local to particular scales and phenomena, and they would not fully reproduce the
internal causal structure of the whole brain.


Figure 3: Size scales in 
the nervous system. 
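

One of the candidate separations discussed in this section, replacing the behaviour of individual molecules with mass-action dynamics of concentrations, can be illustrated with a small sketch. It assumes a single first-order decay reaction, which is a textbook toy rather than anything specific to synapses, and the molecule count, rate and time step are arbitrary illustrative values. One function tracks every molecule stochastically; the other tracks only the aggregate quantity.

```python
import random


def stochastic_decay(n0, rate, dt, steps, seed=0):
    """Micro-scale model: every molecule decays independently.

    In each time step a surviving molecule decays with probability rate * dt.
    """
    rng = random.Random(seed)
    remaining = n0
    p_decay = rate * dt
    for _step in range(steps):
        decayed = sum(rng.random() < p_decay for _ in range(remaining))
        remaining -= decayed
    return remaining


def mass_action_decay(n0, rate, dt, steps):
    """Macro-scale model: dN/dt = -rate * N, integrated with Euler steps."""
    n = float(n0)
    for _step in range(steps):
        n -= rate * n * dt
    return n


micro = stochastic_decay(20_000, rate=1.0, dt=0.01, steps=100)
macro = mass_action_decay(20_000, rate=1.0, dt=0.01, steps=100)
relative_gap = abs(micro - macro) / macro

print(f"molecule-level count: {micro}")
print(f"mass-action estimate: {macro:.1f}")
print(f"relative gap:         {relative_gap:.3%}")
```

With many molecules the two descriptions agree closely, so the cheaper macro-scale model can stand in for the micro-scale one. Rerunning with a handful of molecules makes the fluctuations comparable to the mean, which is the regime where such a separation would break down.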
Simulation scales


The widely reproduced diagram from (Churchland and Sejnowski, 1992) in Figure 3 depicts 
the various levels of organisation in the nervous system ordered by size scale, running from 
the molecular level to the entire system. Simulations (and possibly emulations) can occur on 
all levels: 


• Molecular simulation (individual molecules)

• Chemistry simulation (concentrations, law of mass action)

• Genetic expression

• Compartment models (subcellular volumes)

• Whole cell models (individual neurons)

• Local network models (replaces neurons with network modules such as
  minicolumns)

• System models


Another hierarchy was introduced by John Fiala during the workshop, and will be used with 
some adaptations in this document. 


Table 2: Levels of emulation


Level 1: Computational module
    "Classic AI", high level representations of information and information processing.

Level 2: Brain region connectivity
    Each area represents a functional module, connected to others according to a (species
    universal) "connectome" (Sporns, Tononi et al., 2005).

Level 3: Analog network population model
    Neuron populations and their connectivity. Activity and states of neurons or groups of
    neurons are represented as their time-averages. This is similar to connectionist models
    using ANNs, rate-model neural simulations and cascade models.

[Rows for levels 4 to 6 are missing from this copy of the table.]

Level 7: Proteome
    As above, plus concentrations of proteins and gene expression levels.

Level 8: States of protein complexes
    As above, plus quaternary protein structure.

Level 9: Distribution of complexes
    As above, plus "locome" information and internal cellular geometry.

Level 10: Stochastic behaviour of single molecules
    As above plus molecule positions, or a molecular mechanics model of the entire brain.

Level 11: Quantum
    Quantum interactions in and between molecules.


The amount of understanding needed to accurately simulate the relevant objects tends to
increase radically for higher (here, low-numbered) levels: while basic mechanics is well
understood, membrane biophysics is complex, and the computational functions of brain areas
are likely exceedingly multifaceted. Conversely, the amount of computing power needed
increases rapidly as we descend towards lower levels of simulation, and may become
fundamentally infeasible on level 11². The amount and type of data needed to fully specify a
model also changes character between the different levels. Low-level simulations require
massive quantities of simple information (molecular positions and types) whereas higher
levels require a smaller amount of very complex information (content of mental processes).


² But see also the final chapter of (Hameroff, 1987). The main stumbling block of level 11 simulation may not be
computing hardware or understanding but fundamental quantum limitations on scanning.


Each level has its own characteristic size and time scale, restricting the required imaging 
resolution and simulation timestep. For example, synaptic spine necks and the thinnest axons 
can be on the order of 50 nm or smaller, requiring imaging on the order of the 5 nanometer 
scale to resolve them. 








An informal poll among workshop attendees produced a range of estimates of the required
resolution for WBE. The consensus appeared to be level 4-6. Two participants were more
optimistic about high level models, while two suggested that elements on level 8-9 may be
necessary at least initially (but that the bulk of mature emulation, once the basics were
understood, could occur on level 4-5). To achieve emulation on this level, the consensus was
that 5x5x50 nm scanning resolution would be needed. This roadmap will hence focus on level
4-6 models, while remaining open to the possibility that deeper levels may turn out to be needed.





As noted by Fiala, WBE likely requires at least level 4 to capture the specificity of individual 
brains, but probably requires complexity at level 6 or lower to fully capture the 
computational contributions of ion channels, second messengers, protein level adaptation, 
and stochastic synaptic transmission. Other participants thought that at least level 5 would be 
needed for individual brain properties. 
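

To give a feel for what the 5x5x50 nm scanning resolution mentioned above implies, the following back-of-the-envelope sketch counts voxels for a whole brain. The brain volume of 1400 cm³ and the one byte stored per voxel are assumptions introduced here for illustration and are not figures from this section. The result is a rough order of magnitude for raw, uncompressed scan data, not a storage estimate for an emulation.

```python
BRAIN_VOLUME_CM3 = 1400.0    # ASSUMPTION: rough adult human brain volume
VOXEL_NM = (5.0, 5.0, 50.0)  # scanning resolution quoted in the text
BYTES_PER_VOXEL = 1          # ASSUMPTION: one 8-bit intensity value per voxel

NM_PER_CM = 1e7


def voxel_count(volume_cm3, voxel_nm):
    """Number of voxels needed to tile a volume at the given voxel size."""
    volume_nm3 = volume_cm3 * NM_PER_CM ** 3
    vx, vy, vz = voxel_nm
    return volume_nm3 / (vx * vy * vz)


voxels = voxel_count(BRAIN_VOLUME_CM3, VOXEL_NM)
raw_bytes = voxels * BYTES_PER_VOXEL

print(f"voxels:   {voxels:.2e}")
print(f"raw data: {raw_bytes / 1e21:.2f} zettabytes")
```

Under these assumptions the scan comprises on the order of 10^21 voxels. Changing the assumed volume or bytes per voxel scales the result linearly, so the order of magnitude is fairly insensitive to those choices.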


Forecasting 


Analysing the requirements for emulation (in terms of scanning method, number of entities 
to simulate, resulting storage requirements and computational demands) at each of the levels 
provides a way of bounding progress towards WBE. Given these estimates and scenarios of 
future progress in scanning and computing it is possible to calculate the earliest point in time
where there are enough resources to produce a WBE on a given level at a certain price. As
better information becomes available such estimates can be refined.


Although any time estimates will be subject to strong uncertainties, they can be helpful in 
estimating how far away WBE is from policy-relevant timescales, as well as likely timeframes 
for early small-scale emulations. They also allow comparisons to other technology forecasts, 
enabling estimation of the chances for synergies (e.g. the development of molecular 
nanotechnology, which would accelerate WBE progress), reliability (e.g. the further into the 
future, the more unlikely Moore's law is to hold), and the risk of other technologies 
overtaking WBE (e.g. artificial intelligence). 


Early WBE may require lower-level simulation than later forms, as there might not yet be 
enough experience (and test systems) to determine which simulation elements are strictly 
necessary for success. The main concern in this document is estimating when and how WBE 
will first be achieved rather than its eventual mature or “best” form. 
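

The forecasting calculation described in this section can be sketched as follows, assuming that the available computing capacity at a given price grows exponentially with a fixed doubling time. Every number below (the demand per level, the current capacity, the doubling times and the start year) is a hypothetical placeholder chosen for illustration, not an estimate from this roadmap.

```python
import math


def earliest_year(required, available_now, doubling_years, start_year):
    """Year in which exponentially growing capacity first meets a requirement.

    Solves available_now * 2 ** ((year - start_year) / doubling_years) = required.
    """
    if available_now >= required:
        return float(start_year)
    doublings = math.log2(required / available_now)
    return start_year + doublings * doubling_years


# HYPOTHETICAL inputs, for illustration only.
START_YEAR = 2008
AVAILABLE_NOW = 1e15                      # operations per second at a fixed budget
DEMAND = {"coarse level": 1e18, "fine level": 1e22}
SCENARIOS = {"fast growth": 1.0, "slow growth": 2.0}  # doubling time in years

for level, required in DEMAND.items():
    for scenario, doubling in SCENARIOS.items():
        year = earliest_year(required, AVAILABLE_NOW, doubling, START_YEAR)
        print(f"{level:12s} {scenario:12s} -> {year:.1f}")
```

Because the requirement enters only through a logarithm, an error of a factor of ten in the demand estimate shifts the date by about 3.3 doubling times. This is one reason why the assumed growth scenario can matter as much as the demand estimate itself.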
WBE assumptions


Philosophical assumptions 


Physicalism (everything supervenes on the physical) is a convenient but not necessary 
assumption, since some non-physicalist theories of mental properties could allow them to 
appear in the case of WBE. Success criterion 6b emulation assumes multiple realizability 
(that the same mental property, state, or event can be implemented by different physical 
properties, states, and events). Sufficient apparent success with WBE would provide 
persuasive evidence for multiple realizability. Generally, emulation up to and including level 
6a does not appear to depend on any strong metaphysical assumptions. 


Computational assumptions 


Computability: brain activity is Turing-computable, or if it is uncomputable, the 
uncomputable aspects have no functionally relevant effects on actual behaviour. 


Non-organicism: total understanding of the brain is not required, just component parts and 
their functional interactions. 


Scale separation: at some intermediary level of simulation resolution between the atomic and 
the macroscopic there exist one or more cut-offs such that meeting criterion 2 at this level is 
sufficient for meeting one or more of the higher criteria. 


Component tractability: the actual brain components at the lowest emulated level can be 
understood well enough to enable accurate simulation. 


Simulation tractability: simulation of the lowest emulated level is computationally tractable 
with a practically realizable computer. 


Neuroscience assumptions 


Brain-centeredness: in order to produce accurate behaviour only the brain and some parts of 
the body need to be simulated, not the entire body. 


WBE appears to be a way of testing many of these assumptions experimentally. In acquiring 
accurate data about the structure and function of the brain and representing it as emulations 
it should be possible to find major discrepancies if, for example, Computability is not true. 


Roadmap 


Requirements 


WBE requires three main capabilities: the ability to physically scan brains in order to acquire 
the necessary information, the ability to interpret the scanned data to build a software model, 
and the ability to simulate this very large model. These in turn require a number of 
subcapabilities (Table 3: Capabilities needed for WBE). 


Plausible scanning methods require ways of preparing the brains, in particular separation 
from other tissue, fixation and possibly dyeing. There is also a need for methods of physically 
handling and storing pieces of tissue: since most scanning methods cannot image large 
volumes the brains will have to be sectioned into manageable pieces. This must allow 
corresponding cells and dendrites to be identified on both sides of each cut. While fixation and 
sectioning methods are commonly used in neuroscience, the demands for whole brain 
emulation are stricter: much larger volumes must be handled with far less tolerance for 
damage. 


Imaging methods are discussed in more detail in the chapter on scanning. The three key 
issues are achieving the necessary resolution to image the smallest systems needed for an 
emulation, the ability to image (not necessarily simultaneously) the entire brain, and the 
ability to acquire the functionally relevant information. 


Translating the data from the imaging process into software requires sophisticated image 
processing, the ability to interpret the imagery into simulation-relevant parameters, and 
having a computational neuroscience model of sufficient precision. The image processing will 
have to deal with the unavoidable artefacts from scanning such as distortions and noise, as 
well as occasional lost data. It will likely include methods of converting direct scan data into 
more compressed forms, such as traced structures, in order to avoid excessive storage needs. 
The scan interpretation process makes use of this data to estimate the connectivity, and to 
identify synaptic connections, cell types and simulation parameters. It then places this 
information in an "inventory database" for the emulation. These steps are discussed in the 
image processing chapter. 
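
One way to picture the "inventory database" is as a store of identified cells and synapses together with their estimated parameters. The sketch below is a minimal in-memory version; the class names, fields and methods are hypothetical illustrations, not a schema proposed in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Neuron:
    neuron_id: int
    cell_type: str                                   # from cell type identification
    parameters: dict = field(default_factory=dict)   # estimated simulation parameters


@dataclass
class Synapse:
    pre_id: int                                      # presynaptic neuron
    post_id: int                                     # postsynaptic neuron
    parameters: dict = field(default_factory=dict)   # e.g. an estimated strength


@dataclass
class Inventory:
    neurons: dict = field(default_factory=dict)      # neuron_id -> Neuron
    synapses: list = field(default_factory=list)

    def add_neuron(self, neuron):
        self.neurons[neuron.neuron_id] = neuron

    def add_synapse(self, synapse):
        # Reject synapses whose endpoints were never identified in the scan.
        if synapse.pre_id not in self.neurons or synapse.post_id not in self.neurons:
            raise KeyError("synapse refers to an unknown neuron")
        self.synapses.append(synapse)

    def targets_of(self, neuron_id):
        """Return the ids of neurons that receive input from `neuron_id`."""
        return [s.post_id for s in self.synapses if s.pre_id == neuron_id]


inventory = Inventory()
inventory.add_neuron(Neuron(1, "sensory"))
inventory.add_neuron(Neuron(2, "motor"))
inventory.add_synapse(Synapse(1, 2, {"weight": 0.5}))
print(inventory.targets_of(1))
```

A real inventory would need compact on-disk storage and spatial indexing; the point here is only the separation between identified entities, their connectivity, and their estimated parameters.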


The software model requires both a mathematical model of neural activity and ways of 
efficiently implementing such models on computers (discussed in the chapter on neural 
simulation). Computational neuroscience aims at modelling the behaviour of neural entities 
such as networks, neurons, synapses and learning processes. For WBE, it needs to have 
sufficiently good models of all relevant kinds of subsystems, along with the relevant 
parameters set from scan data in order to construct a computational model of the actual brain 
that was scanned. 


To emulate a brain, we need enough computing power to run the basic emulation software, a 
sufficiently realistic body simulation, and possibly a simulated environment. The key 
demands are for memory storage to hold the information and processor power to run it at a 
suitable speed. The massive parallelism of the problem will put some significant demands on 
the internal bandwidth of the computing system. 
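
The memory and processor demands can be approximated to first order by multiplying the number of simulated entities by per-entity costs. The sketch below shows that arithmetic only; the function and its parameters are hypothetical, bandwidth is not modelled, and the example inputs are placeholders rather than estimates.

```python
def simulation_demands(n_entities, bytes_per_entity,
                       updates_per_second, flops_per_update, speedup=1.0):
    """Return (memory in bytes, processing in FLOPS) for a simulation.

    `speedup` is the desired speed relative to real time (1.0 = real time).
    All inputs are assumptions supplied by the caller.
    """
    memory_bytes = n_entities * bytes_per_entity
    flops = n_entities * updates_per_second * flops_per_update * speedup
    return memory_bytes, flops


# Placeholder values that only demonstrate the call.
memory_bytes, flops = simulation_demands(
    n_entities=1000,
    bytes_per_entity=8,
    updates_per_second=100,
    flops_per_update=10,
)
print(memory_bytes, flops)
```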


In addition, WBE likely requires the development of three supporting technology areas, with 
which it has a symbiotic relationship. First, validation methods are needed to check that other steps 
produce accurate data and models. This includes validation of scanning, validation of scan 
interpretation, validation of neuroscience models, validation of implementation, and ways of 
testing the success of WBE. While ordinary neuroscience research certainly aims at validation, 
it does not systematize it. For a complex multi-step research effort like WBE, integrated 
validation is likely necessary to ensure that bad data or methods do not confuse later steps in 
the process. Second, WBE requires significant low-level understanding of neuroscience in 
order to construct the necessary computational models and scan interpretation methods. This 
is essentially a continuation and strengthening of systems biology and computational 
neuroscience aiming at a very complete description of the brain on some size or functional 
scale. Third, WBE is large-scale neuroscience, requiring methods of automating 
neuroscientific information gathering and experimentation. This will reduce costs and 
increase throughput, and is necessary in order to handle the huge volumes of data needed. 
Large-scale/industrial neuroscience is clearly relevant for other neuroscience projects too. 


Figure 4: Technological capabilities needed for WBE. 


Table 3: Capabilities needed for WBE 





Scanning 
    Preprocessing/fixation: Preparing brains appropriately, retaining relevant microstructure and state 
    Physical handling: Methods of manipulating fixed brains and tissue pieces before, during, and after scanning 
    Imaging 
        Volume: Capability to scan entire brain volumes in reasonable time and expense 
        Resolution: Scanning at enough resolution to enable reconstruction 
        Functional information: Scanning is able to detect the functionally relevant properties of tissue 

Translation 
    Image processing 
        Geometric adjustment: Handling distortions due to scanning imperfection 
        Data interpolation: Handling missing data 
        Noise removal: Improving scan quality 
        Tracing: Detecting structure and processing it into a consistent 3D model of the tissue 
    Scan interpretation 
        Cell type identification: Identifying cell types 
        Synapse identification: Identifying synapses and their connectivity 
        Parameter estimation: Estimating functionally relevant parameters of cells, synapses, and other entities 
        Databasing: Storing the resulting inventory in an efficient way 
    Software model of neural system 
        Mathematical model: Model of entities and their behaviour 
        Efficient implementation: Implementation of model 

Simulation 
    Storage: Storage of original model and current state 
    Bandwidth: Efficient inter-processor communication 
    CPU: Processor power to run simulation 
    Body simulation: Simulation of body enabling interaction with virtual environment or through robot 
    Environment simulation: Virtual environment for virtual body 





Linkages 


Most of the capabilities needed for WBE are independent of each other, or form synergistic 
clusters. Clusters of technologies develop together, supporting and motivating each other 
with their output. A typical example is better mathematical models stimulating a need for 
better implementations and computing capacity, while improvements in the latter two 
stimulate interest in modelling. Another key cluster is 3D microscopy and image processing, 
where improvements in one make the other more useful. 


There are few clear cases where a capability needs a completed earlier capability in order to 
begin development. Current fixation and handling methods are likely unable to meet the 
demands of WBE-level 3D microscopy, but are good enough to enable early development for 
certain small systems. Scan interpretation needs enough scan data to develop methods, but 
given current research the bottleneck appears to be more on the image processing and 
interpretation side than data availability. Achieving large volume scanning requires 
parallelization and scaling up previous scanning methods, for example by using robotic work 
and parallel microscopy. This requires researchers thinking in terms of, and having 
experience with, an "industrial" approach to data collection. 


This interlinked nature of the field avoids any obvious technology thresholds and 
bottlenecks. There is no one technology that must be developed before other technologies can 
advance. Development can occur on a broad front simultaneously, and rapid progress in a 
field can promote feedback progress in related fields. Unfortunately, it also means that slow 
progress in one area may hold back other areas, not just due to lack of results but by reduced 
demand for their findings, reduced funding, and focus on research that does not lead in the 
direction of WBE. 


Roadmap 


Based on these considerations we can sketch out a roadmap with milestones, required 
technologies, key uncertainties and external technology interactions. 


The approach to WBE has two phases. The first phase consists of developing the basic capabilities 
and settling key research questions that determine the feasibility, required level of detail and 
optimal techniques. This phase mainly involves partial scans, simulations and integration of 
the research modalities. 


The second phase begins once the core methods have been developed and an automated 
scan-interpret-simulate pipeline has been achieved. At this point the first emulations 
become possible. If the developed methods prove to be scalable they can then be applied to 
increasingly complex brains. Here the main issue is scaling up techniques that have already 
been proven on the small scale. 


Figure 5: WBE roadmap. 


The key milestones are: 


Ground truth models: a set of cases where the biological “ground truth” is known and can be 
compared to scans, interpretations and simulations in order to determine their accuracy. 


Determining appropriate level of simulation: this includes determining whether there exists 
any suitable scale separation in brains (if not, the WBE effort may be severely limited), and if 
so, on what level. This would then be the relevant scale for scanning and simulation. 


Full cell simulation: a complete simulation of a cell or similarly complex biological system. 
While not strictly necessary for WBE it would be a test case for large-scale simulations. 


Body simulation: an adequate simulation for the model animal's body (and environment). 
Ideally demonstrated by “fooling” a real animal connected to it. 


Simulation hardware: special-purpose simulation/emulation computer hardware may be 
found to be necessary or effective. 


Organism simulation: a simulation of an entire organism in terms of neural control, body 
state and environmental interaction. This would not be a true emulation since it is not based 
on any individual but rather known physiological data for the species. This would enable 
more realistic and individual models as scans, models and computer power improve. 


Demonstration of function deduction: demonstrating that all relevant functional properties 
on a level can be deduced from scan data. 


Complete inventory: a complete database of entities at some level of resolution for a neural 
system, e.g. not just the connectivity of the C. elegans nervous system but also the 
electrophysiology of the cells and synapses. This would enable full emulation if all the update 
rules are known. It demonstrates that the scanning and translation methods have matured. 


Automated pipeline: a system able to produce a simulation based on an input tissue sample, 
going through the scan, interpretation and simulation steps without major human 
intervention. The resulting simulation would be based on the particular tissue rather than 
being a generic model. 


Partial emulation: A complete emulation of a neural system such as the retina, invertebrate 
ganglia or a V1 circuit based on scanned and interpreted data from a brain rather than species 
data. This would demonstrate the feasibility of data-driven brain emulation. 


Eutelic organism emulation: a complete emulation of a simple organism, such as C. elegans or 
another eutelic (fixed nervous system) organism using data from pipeline scanning. It may 
turn out that it is unnecessary to start with a eutelic organism and the first organism 
emulation would be a more complex invertebrate. 


Invertebrate WBE: Emulation of an invertebrate such as a snail or an insect, with learning. 
This would test whether the WBE approach can produce appropriate behaviours. If the 
scanned individual was trained before scanning, retention of trained responses can be 
checked. 


Small mammal WBE: Demonstration of WBE in mice or rats, proving that the approach can 
handle mammalian neuroanatomy. 


Large mammal WBE: Demonstration in higher mammals, giving further information about 
how well individuality, memory and skills are preserved as well as investigation of safety 
concerns. 


Human WBE: Demonstration of an interactive human emulation. 


Figure 6: Technology drivers for WBE-necessary technologies. 


Different required technologies have different support and drivers for development. 
Computers are developed independently of any emulation goal, driven by mass market 
forces and the need for special high performance hardware. Moore's law and related 
exponential trends appear likely to continue some distance into the future, and the feedback 
loops powering them are unlikely to rapidly disappear (see further discussion in Appendix B: 
Computer Performance Development). There is independent (and often sizeable) investment 
into computer games, virtual reality, physics simulation and medical simulations. Like 
computers, these fields produce their own revenue streams and do not require WBE-specific 
or scientific encouragement. 


A large number of the other technologies, such as microscopy, image processing, and 
computational neuroscience are driven by research and niche applications. This means less 
funding, more variability of the funding, and dependence on smaller groups developing 
them. Scanning technologies are tied to how much money there is in research (including 
brain emulation research) unless medical or other applications can be found. Validation 
techniques are not widely used in neuroscience yet, but could (and should) become standard 
as systems biology becomes more common and widely applied. 


Finally there are a few areas relatively specific to WBE: large-scale neuroscience, physical 
handling of large amounts of tissue blocks, achieving high scanning volumes, measuring 
functional information from the images, and automated identification of cell types, synapses, 
connectivity and parameters. These areas are the ones that need the most support in order to 
enable WBE. 


This last group is also the hardest to forecast, since it has weak drivers and a small number 
of researchers. The first group is easier to extrapolate by using current trends, with the 
assumption that they remain unbroken sufficiently far into the future. 


Uncertainties and alternatives 


The main uncertainties in the roadmap are: 


Does scale separation enabling WBE occur? This is a basic science question, and if scale 
separation does not occur at a sufficiently high scale-level then WBE would be severely 
limited or infeasible. It is possible that progress in understanding complex systems in general 
will help clarify the situation. The question of which complex systems are simulable under 
which conditions is of general interest for many fields. However, in relation to WBE the 
answer seems most likely to come from advances in computational neuroscience and from 
trial-and-error. By experimenting with various specialized neural emulations, at different 
levels of resolution, and comparing the functionally-relevant computational properties of the 
emulation with those of the emulated subsystem, we can test whether a given emulation is 
successful. If so, we can infer that a sufficient scale separation for that subsystem exists at (or 
above) the scale level (granularity) used in the emulation. We emphasize that a successful 
emulation need not predict all details of the original behavior of the emulated system; it need only 
replicate computationally relevant functionality at the desired level of emulation. 
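
The trial-and-error test described above can be sketched as a search for the coarsest emulation whose functionally relevant output stays within a tolerance of the original subsystem. Everything in this sketch is an illustrative assumption: the function names, the use of a single numeric trace as the "functionally relevant" measure, the maximum-absolute-error criterion, and the example level labels.

```python
def max_abs_error(reference, candidate):
    """Largest pointwise difference between two equally long traces."""
    if len(reference) != len(candidate):
        raise ValueError("traces must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def coarsest_adequate_level(reference, emulations, tolerance):
    """Return the label of the first emulation that matches the reference.

    `emulations` is a list of (label, trace) pairs ordered from coarsest
    to finest resolution. Returns None if no level is adequate, which
    would suggest no scale separation at any of the tested levels.
    """
    for label, trace in emulations:
        if max_abs_error(reference, trace) <= tolerance:
            return label
    return None


# Made-up traces standing in for a functionally relevant output measure.
reference = [0.0, 1.0, 0.0]
emulations = [
    ("network", [0.0, 0.0, 0.0]),
    ("compartment", [0.0, 0.9, 0.0]),
    ("molecular", [0.0, 1.0, 0.0]),
]
print(coarsest_adequate_level(reference, emulations, tolerance=0.2))
```

Choosing the comparison measure is the substantive scientific question; the code only makes explicit that success is judged on functional output, not on matching every internal detail.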


What levels are possible/most appropriate for emulation? This will determine both the 
requirements for scanning and emulation software. In order to discover it, small-scale 
scanning and simulation projects need to be undertaken, developing skills and methods. 
Later, full-scale emulations of small systems will test whether the estimates hold. If an early 
answer can be found, efforts can focus on this level; otherwise the WBE research front would 
have to work on multiple levels. 


How much of the functionally relevant information can be deduced from scanning in a 
particular modality (e.g. electron microscopy)? At present, electron microscopy appears to 
be the only scanning method that has the right resolution to reach synaptic connectivity, but 
it is limited in what chemical state information it can reveal. If it is possible to deduce the 
function of a neuron, synapse or other structure through image interpretation methods, then 
scanning would be far simpler than if this is not (in which case some form of hybrid method 
or entirely new scanning modality would have to be developed). This issue appears to form a 
potentially well-defined research question that could be pursued. Answering it would require 
finding a suitable model system for which ground truth (the computational functionality of 
the target system) was known, using the scanning modality to produce imagery and then testing 
out various forms of interpretation on the data. 


Figure 7: WBE computational biology research cycle (based on Takahashi, Yugi et al., 
2002). WBE introduces large-scale scanning and simulation into the cycle. 


Developing the right research cycle. The computational biology research cycle today 
involves wet experiments providing cellular data and hypotheses, which drive qualitative 
modelling. This modelling in turn is used in quantitative modelling, which, using simulations, 
generates data that can be analysed and employed to refine the models, compare with the 
experiments, and suggest new experiments (Takahashi, Yugi et al., 2002). The WBE paradigm 
incorporates this research cycle (especially in the software modelling part), but includes two 
new factors. One is large-scale scanning and processing of brain tissue, providing massive 
amounts of data as input to the cycle, but also requiring the models to guide the development 
of scanning methods. The second is large-scale data-driven simulation that aims not just at 
hypothesis testing and model refinement, but also at accurately mimicking the 
wet system. Both factors will increase and change the demands for data management, 
hypothesis searching and simulation analysis/interpretation. They will also introduce 
sociological and interdisciplinary factors, as different academic disciplines with very different 
methodologies will have to learn how to communicate and cooperate. For the field to be viable, 
research methods for testing, data-sharing and validation, as well as standards for what constitutes 
a result, must be developed so that the extended cycle provides incentives for all participants 
to cooperate and push the technology forward. This is closely linked to the likely move 
towards large-scale neuroscience, where automated methods will play an increasingly 
prominent role as they have done in genomics. 





Figure 8: Caenorhabditis elegans, a popular model organism with a fully mapped 
302-neuron nervous system. 


Selection of suitable model systems. Selecting the right targets for scanning and modelling 
will have to take into account existing knowledge, existing research communities, likelihood 
of funding and academic impact as well as practical factors. While the C. elegans nervous 
system has been completely mapped (White, Southgate et al., 1986), we still lack detailed 
electrophysiology, likely because of the difficulty of investigating the small neurons. Animals 
with larger neurons may prove less restrictive for functional and scanning investigation but 
may lack sizeable research communities. Molluscs such as freshwater snails and insects such 
as fruit flies may be promising targets. They have well characterised brains, existing research 
communities and neural networks well within current computational capabilities. 


Similarly, the selection of subsystems of the brain to study requires careful consideration. 
Some neural systems are heavily studied (cortical columns, the visual system, the 
hippocampus) and better data about them would be warmly received by the research 
community, yet the lack of characterization of their inputs, outputs and actual function may 
make development of emulation methods very hard. One system that may be very promising 
is the retina, which has an accessible geometry, is well studied and somewhat well 
understood, is not excessively complex, and better insights into which would be useful to a 
wide research community. Building on retinal models, models of the lateral geniculate 
nucleus and visual cortex may be particularly useful, since they would both have relatively 
well-defined inputs from the previous stages. 


At what point will the potential be clear enough to bring major economic actors into WBE? 
Given the potentially huge economic impact of human WBE (Hanson, 1994, 2004, 2008a, 
2008b), if the field shows sufficient promise, major economic actors will become interested in 
funding and driving the research as an investment. It is unclear how far advanced the field 
would need to be in order to garner this attention. Solid demonstrations of key technologies 
are likely required, as well as a plausible path towards profitable WBE. The impact of funding 
on progress will depend on the main remaining bottlenecks and their funding elasticity. If 
scanning throughput or computer power is the limiting factor, extra funding can relatively 
easily scale up facilities. By contrast, limitations in neuroscience understanding are less 
responsive to investment. If funding arrives late, when the fundamental problems have 
already been solved, the amplified resources would be used to scale up pre-existing 
small-scale demonstration projects. 


Intellectual property constitutes another important consideration for commercial funding: 
what could early developers own, how secure and long-term would their investment be? 
Without solid prospects of having preferential ownership of what it develops, a firm is 
unlikely to pursue the project. 


Alternative pathways 


Special hardware for WBE. It is possible that WBE can be achieved more efficiently using 
dedicated hardware rather than generic hardware. Such performance gains are possible if, for 
example, there is a close mapping between the hardware and the brain, or if the functions of 
the emulation software could be implemented efficiently physically. Developing such 
dedicated hardware would be costly unless other applications existed (which is why interest 
in dedicated neural network chips peaked in the early 1990s). 


Dedicated neural network chips have reached up to 1.7 billion synaptic updates (and 337 
million synaptic adjustments) per second for ANN models (Kondo, Koshiba et al., 1996), 
which is approaching current supercomputing speeds for more complex models. Recently, 
there has been some development of FPGAs (Field-Programmable Gate Arrays) for running 
complex neuron simulations, producing an order of magnitude faster simulation for a 
motor neuron than a software implementation (four times real-time, 8M compartments/s) 
(Weinstein and Lee, 2005). An FPGA implementation has the advantage of being 
programmable, not requiring special-purpose WBE hardware. Other advantages are 
that as long as there is chip space, more complex models do not require more processing time, 
and that precision can be adjusted to suit the model and reduce space requirements. 
However, scaling up to large and densely interconnected networks will require developing 
new techniques (Weinstein and Lee, 2006). A better understanding of the neocortical 
architecture may serve to produce hardware architectures that fit it well (Daisy project, 2008). 
It has been suggested that using FPGAs could increase computational speeds in network 
simulations by up to two orders of magnitude, and in turn enable testing grounds for 
developing special purpose WBE chips (Markram, 2006). 


It may also be possible to use embedded processor technology to manufacture large amounts 
of dedicated hardware relatively cheaply. A study of high resolution climate modelling in the 
petaflop range found a 24- to 34-fold reduction of cost and about two orders of magnitude 
smaller power requirements using a custom variant of embedded processor chips (Wehner, 
Oliker et al., 2008). 


This roadmap is roughly centred on the assumption that scanning technology will be similar 
to current microscopy, developed for large-scale neuroscience, automated sectioning of 
fixated tissue and local image-to-model conversion. For reasons discussed in the scanning 
section, non-destructive scanning of living brains appears to be hard compared to the 
“slice-and-dice” approach where we have various small-scale existence proofs. However, as pointed 
out by Robert Freitas Jr., nanomedical techniques could possibly enable non-destructive 
scanning by use of invasive measurement devices. Even if such devices prove infeasible, 
molecular nanotechnology could likely provide many new scanning methodologies as well 
as radical improvement of neuroscientific research methods and the efficiency of many 
roadmap technologies. Even far more modest developments such as single molecule analysis, 
nanosensors, artificial antibodies and nanoparticles for imaging (which are expected to be in 
use by 2015 (Nanoroadmap Project, 2006)) would have an important impact. Hence early or 
very successful nanotechnology would offer faster and alternative routes to achieve the 
roadmap. Analysing the likelihood, timeframe, and abilities of such nanomedicine is outside 
the scope of this document. 


One possible scanning alternative not examined much here is high resolution scanning of 
frozen brains using MRI. This might be a complement to destructive scanning, but could 
possibly gain enough information to enable WBE. However, we currently have little 
information on the limits and possibilities of the technique (see discussion in Appendix E: 
Non-destructive and gradual replacement). 


As discussed in the overview, WBE does not assume any need for high-level understanding 
of the brain or mind. In fact, should such understanding be reached it is likely that it could be 
used to produce artificial intelligence (AI). Human-level AI (or superintelligent AI) would 
not necessarily preclude WBE, but some of the scientific and economic reasons would vanish, 
possibly making the field less relevant. On the other hand, powerful AI could greatly 
accelerate neuroscience advances and perhaps help develop WBE for other purposes. 
Conversely, success in some parts of the WBE endeavour could help AI, for example if 
cortical microcircuitry and learning rules could be simulated efficiently as a general 
learning/behaviour system. 


The impact and development of WBE will depend on which of the main capabilities 
(scanning, interpretation, simulation) develop last. If they develop relatively independently 
it would be unlikely for all three to mature enough to enable human-level emulations at the 
same time. If computing power is the limiting factor, increasingly complex animal emulations 
are likely to appear, giving society time to adapt to the prospect of human-level WBE in the near 
future. If scanning resolution, image interpretation, or neural simulation is the limiting factor, 
a relatively sudden breakthrough is possible: there is enough computing power, scanning 
technology, and software to go rapidly from simple to complex organisms, using relatively 
small computers and projects. This could lead to a surprise scenario wherein society has little 
time to consider the possibility of human-level WBE. If computing power is the limiting 
factor, or if scanning is the bottleneck due to lack of throughput, then the pace of 
development would likely be economically determined: if enough investment were made, 
WBE could be achieved rapidly. This would place WBE enablement under political or 
economic control to a greater degree than in the alternative scenarios. 


Related technologies and spin-offs 


The technologies needed to achieve WBE include the ability to scan organic tissue on a low 
level, interpret the findings into functional models, and run extremely large-scale 
simulations. WBE also requires sufficient knowledge of low-level neural function. 


The desire for running extremely large simulations has been a strong motivator for 
supercomputing. Applications in nanotechnology, virtual reality, cryptography, signal 
processing, mathematics, genomics, and simulation of transportation, societies, business, 
physics, biology, and climate science will require petaflops performance in the next decade. It 
is unlikely that this will be the end, and exaflop performance is already being discussed in the 
supercomputing community. However, scaling up current architectures to the exascale may 
be problematic, and may require new ways of thinking about how to manage complex 
concurrent systems. These will to a large degree be shaped by the problems the computers 
are intended to solve (and the relative ranking of these problems by society, affecting
funding) as well as by tradeoffs between performance and price, energy requirements³, and
other constraints. The related area of very large-scale information storage is also a key issue 
for WBE. 


An obvious related technology is the creation of virtual body models for use in medicine. 
They can be used as training and study objects, or, at a more advanced stage, as personal 
medical models. Such models might enable doctors to investigate the current state of the 
patient, compare it to previous data, test or optimise various forms of simulated treatment, 
train particular surgical approaches, etc. This is a technology of significant practical 
importance that is driven by current advances in medical imaging, the personalisation of 
healthcare, and improving physiological models. While most current models are either static 
datasets or high-level physiological models, personalisation requires developing methods of 
physiological parameter estimation from the data. Simulation will promote the development 
of more realistic body models usable for WBE. Conversely, the WBE focus on data-driven 
bottom-up simulation appears to fit in with developing personalised biochemical and tissue 
models. 


Virtual test animals, if they could be developed to a sufficient degree of realism, would be
in high demand as faster and less expensive testing grounds for biomedical hypotheses
(Michelson and Cole, 2007; Zheng, Kreuwel et al., 2007). They may perhaps also be a way of 
avoiding the ethical controversies surrounding animal testing (although it is not 
inconceivable that concerns about animal welfare would in time be extended to emulated 
animals). This could provide an impetus for the development of organism emulation 
techniques and especially validation methods that help guarantee that the virtual animals 
have the same properties as real animals would. 


Overall, there is increasing interest and capability in quantitative modelling of biology (Di 
Ventura, Lemerle et al., 2006). While much effort has gone into metabolic or control networks 
of particular interest, there is a push towards cell models encompassing metabolism, the 
genome and proteomics (Ortoleva, Berry et al., 2003; Tomita, 2001; Schaff, Slepchenko et al., 
2001; Tyson, 2001). There is also interest in building simulators for modelling self-assembly in 
subcellular systems (Ortoleva, Berry et al., 2003). Besides predicting biological responses and 
helping to understand biology, modelling will likely become important for synthetic biology 
(Serrano, 2007). 


In order to produce realistic virtual stimuli for a brain emulation, accurate simulations of the 
sensory input to the brain are needed. The same techniques used to investigate neural 
function can be applied to the senses. It has also been proposed that technology to record 
normal neural activity along the brain nerves and spinal cord could be helpful. Such data, for 
example recorded using massively parallel electrode arrays or nanoimplants, would provide 
good test data to validate a body model. Recorded sensory data would also be repeatable, 
enabling comparisons between different variants of the same emulation. Neural interfacing is 
also useful for developing better robotic body models. Currently, most interest in neural 
interfaces is focused on helping people with disabilities use sensory or motoric prosthetics. 
While neural interfacing for enhancement purposes is possible, it is unlikely to become a 
significant driver until prosthetic systems have become cheap, safe, and very effective. 





3 An exascale computer using 2008 technology would require tens of megawatts (Sandia National Laboratories, 2008).


Issues


Emulation systems 


A functioning brain emulation will include, in 
addition to the brain model (the main part), 
some way for the brain model to experience 
bodily interactions with an environment. There 
are two different ways in which this could be 
accomplished: via a simulated virtual body 
inhabiting a virtual reality (which can be linked 
to the outside world); or via a hardware body 
connected to the brain model via a body 
interface module. 


Entry-level WBE does not require the capacity to accommodate all of the original sensory
modalities or to provide a fully naturalistic body experience. Simulated bodies and worlds, or
hardware body interfaces, as well as communications with the outside world, are not
necessary per se for brain emulation except insofar as they are needed to maintain short-term
function of the brain. For long-term function, especially of human mind emulations,
embodiment and communication are important. Sensory or motor deprivation appears to
produce intellectual and perceptual deficits within a few days (Zubek and Macneill, 1966).


Figure 9: Emulation system for completely virtual emulation.


The brain emulator performs the actual 
emulation of the brain and closely linked 
subsystems such as brain chemistry. The result 
of its function is a series of states of emulated 
brain activity. The emulation produces and 
receives neural signals corresponding to motor 
actions and sensory information (in addition, 
some body state information such as glucose 
levels may be included). 


The body simulator contains a model of the body and its internal state. It produces sensory
signals based on the state of the body model and the environment, sending them to the
brain emulation. It converts motor signals to muscle contractions or direct movements in the
body model. The degree to which different parts of the body require accurate simulation is
likely variable.


The environment simulator maintains a model of the surrounding environment, responding
to actions from the body model and sending back simulated sensory information. This is also
the most convenient point of interaction with the outside world. External information can be
projected into the environment model, virtual objects with real world affordances can be used
to trigger suitable interaction, etc.


The overall emulation software system (the “exoself” to borrow Greg Egan’s term) would 
regulate the function of the simulators and emulator, allocate computational resources, collect 
diagnostic information, provide security (e.g. backups, firewalls, error detection, encryption) 
and so on. It could provide software services to emulated minds (accessed through the virtual 
environment) and/or outside experimenters. 
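
The data flow between the components described above can be sketched as a simple update loop. The class and method names below (BrainEmulator, BodySimulator, EnvironmentSimulator, Exoself, step, run) are hypothetical placeholders, and the "dynamics" are trivial stand-ins chosen only to show which component exchanges which signals. It is a minimal sketch of the architecture, not a proposal for how an actual emulation would be implemented.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentSimulator:
    """Toy world state; responds to body actions with sensory information."""
    light_level: float = 1.0

    def step(self, body_action: float) -> float:
        # Hypothetical rule: the action brightens or dims the light.
        self.light_level = max(0.0, self.light_level + 0.1 * body_action)
        return self.light_level


@dataclass
class BodySimulator:
    """Translates between neural signals and body/environment quantities."""
    glucose: float = 5.0  # example of body state information passed to the brain

    def motor_to_action(self, motor_signal: float) -> float:
        return motor_signal  # stand-in for muscle contraction or movement

    def sense(self, stimulus: float) -> dict:
        return {"vision": stimulus, "glucose": self.glucose}


@dataclass
class BrainEmulator:
    """Produces a series of emulated brain states (here a single number)."""
    state: float = 0.0

    def step(self, sensory: dict) -> float:
        # Placeholder dynamics: leaky integration of the visual input.
        self.state = 0.9 * self.state + 0.1 * sensory["vision"]
        return -self.state  # motor signal sent back to the body


@dataclass
class Exoself:
    """Overall system: schedules the components and collects diagnostics."""
    brain: BrainEmulator = field(default_factory=BrainEmulator)
    body: BodySimulator = field(default_factory=BodySimulator)
    env: EnvironmentSimulator = field(default_factory=EnvironmentSimulator)
    log: list = field(default_factory=list)

    def run(self, n_steps: int) -> None:
        motor = 0.0
        for t in range(n_steps):
            action = self.body.motor_to_action(motor)
            stimulus = self.env.step(action)
            sensory = self.body.sense(stimulus)
            motor = self.brain.step(sensory)
            self.log.append((t, self.brain.state, stimulus))  # diagnostics


system = Exoself()
system.run(5)
```

In this arrangement, external information would enter through EnvironmentSimulator, matching the observation that the environment model is the most convenient point of interaction with the outside world.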


A variant of the above system would be an embodied brain emulation, in which case the 
body simulator would merely contain the translation functions between neural activity and 
physical signals, and these would then be actuated using a hardware body. The body might 
be completely artificial (in which case motor signals have to be mapped onto appropriate 
body behaviours) or biological but equipped with nerve-computer interfaces enabling 
sensing and control. The computer system running the emulation does not have to be 
physically present in the body. 


It is certainly possible to introduce signals from the outside on higher levels than in a
simulated or real body. It would be relatively trivial to add visual or auditory information 
directly to the body model and have them appear as virtual or augmented reality. 
Introducing signals directly into the brain emulation would require them to make sense as 
neural signals (e.g. brain stimulation or simulated drugs). "Virtual brain-computer interfaces" 
with perfect clarity and no risk of side effects could be implemented as extensions of the body 
simulation/interface. 


If computing power turns out to be a bottleneck resource, then early emulations are likely to 
run more slowly than the biological system they aim at emulating. This would limit the 
ability of the emulations to interact in real-time with the physical world. In distributed 
emulations delays between computing nodes put a strong limit on how fast they can become. 
The shortest delay using 100 m/s axons across the brain is about 1 ms. Assuming light speed 
communication, processing nodes cannot be further away than 300 km if longer delays are to 
be avoided. 
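
The distance limit follows from multiplying the delay budget by the signal speed. The short sketch below reproduces the arithmetic; the 0.1 m path length across the brain is an assumption chosen to match the quoted 1 ms delay at 100 m/s, and the function names are illustrative only.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; real links are slower


def axonal_delay_s(path_length_m: float, velocity_m_per_s: float) -> float:
    """Conduction delay along an axon of the given length."""
    return path_length_m / velocity_m_per_s


def max_node_separation_km(delay_budget_s: float,
                           signal_speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Largest one-way distance a signal can cover within the delay budget."""
    return delay_budget_s * signal_speed_m_per_s / 1000.0


# Assumption: a path of about 0.1 m across the brain at 100 m/s.
brain_delay_s = axonal_delay_s(0.1, 100.0)
limit_km = max_node_separation_km(brain_delay_s)
print(f"delay = {brain_delay_s * 1e3:.1f} ms, separation limit = {limit_km:.0f} km")
```

Since signals in optical fibre travel at roughly two thirds of the vacuum speed of light, the practical limit for a real-time distributed emulation would be correspondingly smaller.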


Complications and exotica 


Besides straight neural transmission through synapses there may be numerous other forms of
information processing in the brain that may have to be emulated. How important they are 
for success in emulation remains uncertain. An important application of early brain 
emulations and their precursors will be to enable testing of their influence. 


Spinal cord 


While traditionally the vertebrate spinal cord is often regarded as little more than a bundle of 
motor and sensor axons together with a central column of stereotypical reflex circuits and 
pattern generators, there is evidence that the processing may be more complex (Berg, 
Alaburda et al., 2007) and that learning processes occur among spinal neurons (Crown, 
Ferguson et al., 2002). The networks responsible for standing and stepping are extremely 
flexible and unlikely to be hardwired (Cai, Courtine et al., 2006).


This means that emulating just the brain part of the central nervous system will lose much
body control that has been learned and resides in the non-scanned cord. On the other hand, it 
is possible that a generic spinal cord network would, when attached to the emulated brain, 
adapt (requiring only scanning and emulating one spinal cord, as well as finding a way of 
attaching the spinal emulation to the brain emulation). But even if this is true, the time taken 
may correspond to rehabilitation timescales of (subjective) months, during which time the 
simulated body would be essentially paralysed. This might not be a major problem for 
personal identity in mind emulations (since people suffering spinal injuries do not lose 
personal identity), but it would be a major limitation to their usefulness and might limit 
development of animal models for brain emulation. 


A similar concern could exist for other peripheral systems such as the retina and autonomic 
nervous system ganglia. 


The human spinal cord weighs 2.5% of the brain and contains around 10⁻⁴ of the number of
neurons in the brain (13.5 million neurons). Hence adding the spinal cord to an emulation 
would add a negligible extra scan and simulation load. 


Synaptic adaptation 

Synapses are usually characterized by their "strength", the size of the postsynaptic potential 
they produce in response to a given magnitude of incoming excitation. Many (most?) 
synapses in the CNS also exhibit depression and/or facilitation: a temporary change in release 
probability caused by repeated activity (Thomson, 2000). These rapid dynamics likely play a
role in a variety of brain functions, such as temporal filtering (Fortune and Rose, 2001),
auditory processing (Macleod, Horiuchi et al., 2007) and motor control (Nadim and Manor, 
2000). These changes occur on timescales longer than neural activity (tens of milliseconds) but 
shorter than long-term synaptic plasticity (minutes to hours). Adaptation has already been 
included in numerous computational models. The computational load is usually 1-3 extra 
state variables in each synapse. 
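
To illustrate what "1-3 extra state variables" can look like, the sketch below uses one common variant of the Tsodyks-Markram phenomenological model of short-term plasticity. This model is brought in here as an example and is not taken from the text above; the class name and all parameter values are illustrative assumptions. Each synapse carries a resource variable x (depression), a release fraction u (facilitation) and the time of the last spike.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptiveSynapse:
    weight: float = 1.0       # static synaptic "strength"
    U: float = 0.5            # release fraction increment per spike (assumed)
    tau_rec: float = 0.8      # s, recovery from depression (assumed)
    tau_facil: float = 0.05   # s, decay of facilitation (assumed)
    x: float = 1.0            # extra state 1: fraction of available resources
    u: float = 0.0            # extra state 2: running release fraction
    last_spike: Optional[float] = None  # extra state 3: time of last spike

    def on_spike(self, t: float) -> float:
        """Update the state at a presynaptic spike and return the response."""
        if self.last_spike is not None:
            dt = t - self.last_spike
            # Resources recover towards 1; facilitation decays towards 0.
            self.x = 1.0 - (1.0 - self.x) * math.exp(-dt / self.tau_rec)
            self.u *= math.exp(-dt / self.tau_facil)
        self.u += self.U * (1.0 - self.u)   # spike-triggered facilitation
        released = self.u * self.x          # fraction of resources released
        self.x -= released                  # depletion (depression)
        self.last_spike = t
        return self.weight * released


spike_times = [0.00, 0.05, 0.10, 0.15]  # a 20 Hz train, in seconds

depressing = AdaptiveSynapse(U=0.5, tau_rec=0.8, tau_facil=0.05)
dep_responses = [depressing.on_spike(t) for t in spike_times]

facilitating = AdaptiveSynapse(U=0.1, tau_rec=0.1, tau_facil=1.0)
fac_responses = [facilitating.on_spike(t) for t in spike_times]
```

With these assumed parameters the first synapse responds more weakly to the second spike than to the first, while the second synapse responds more strongly, using the same update rule.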


Unknown neurotransmitters and neuromodulators 


Not all neuromodulators are known. At present about 10 major neurotransmitters and 200+ 
neuromodulators are known, and the number is increasing. (Thomas, 2006) lists 272 
endogenous extracellular neuroactive signal transducers with known receptors, 2 gases, 19 
substances with putative or unknown binding sites and 48 endogenous substances that may 
or may not be neuroactive transducers (many of these may be more involved in general 
biochemical signalling than brain-specific signals). Plotting the year of discovery for different 
substances (or families of substances) suggests a linear or possibly sigmoidal growth over 
time (Figure 11).


Figure 11: Time of discovery of the neurotransmitter or neuromodulator activity of a
number of substances or families of substances. Data taken from (von Bohlen und Halbach 
and Dermietzel 2006), likely underrepresenting the development after 2000. 


An upper bound on the number of neuromodulators can be found using genomics. About 800
G-protein coupled receptors can be found in the human genome, of which about half are
sensory receptors. Many are "orphans" that lack known ligands, and methods of
“deorphanizing” receptors by expressing them and determining what they bind to have been
developed. In the mid-1990s about 150 receptors had been paired to 75 transmitters,
leaving around 150-200 orphans in 2003 (Wise, Jupe et al., 2004). At present, 7-8 receptors are
deorphanized each year (von Bohlen und Halbach and Dermietzel, 2006); at this rate all
orphans should be adopted within ≈20 years, leading to the discovery of around 50 more
transmitters (Civelli, 2005).
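
The extrapolation is a simple division of the remaining orphans by the annual deorphanization rate. The sketch below brackets it using the figures quoted above, under the assumption (not guaranteed) that the rate stays constant; the function name is illustrative.

```python
def years_to_deorphanize(n_orphans: float, receptors_per_year: float) -> float:
    """Years needed to pair all orphan receptors at a constant annual rate."""
    return n_orphans / receptors_per_year


fastest_years = years_to_deorphanize(150, 8)  # fewest orphans, highest rate
slowest_years = years_to_deorphanize(200, 7)  # most orphans, lowest rate
print(f"between {fastest_years:.0f} and {slowest_years:.0f} years")
```

The result is roughly two to three decades, counted from the 2003 orphan estimate.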


Similarly, guanylyl cyclase-coupled receptors (four orphans; Wedel and Garbers, 1998),
tyrosine kinase-coupled receptors (≪100; Muller-Tidow, Schwable et al., 2004) and cytokine
receptors would add a few extra transmitters.


However, there is room for some surprises. Recently it was found that protons were used to
signal in C. elegans rhythmic defecation (Pfeiffer, Johnson et al., 2008) mediated using a
Na⁺/H⁺ exchanger, and it is not inconceivable that similar mechanisms could exist in the
brain. Hence the upper bound on all transmitters may be set by not just receptors but also by
membrane transporter proteins.


For WBE, modelling all modulatory interactions is probably crucial, since we know that
neuromodulation does have important effects on mood, consciousness, learning and
perception. This means not just detecting their existence but also creating quantitative models of
these interactions, a sizeable challenge for experimental and computational neuroscience.


Unknown ion channels 


Similar to receptors, there are likely unknown ion channels that affect neuron dynamics. 


The Ligand Gated Ion Channel Database currently contains 554 entries with 71 designated as 
channel subunits from Homo sapiens (EMBL-EBI, 2008; Donizelli, Djite et al., 2006). Voltage 
gated ion channels form a superfamily with at least 143 genes (Yu, Yarov-Yarovoy et al., 
2005). This diversity is increased by multimerization (combinations of different subunits), 
modifier subunits that do not form channels on their own but affect the function of channels 
they are incorporated into, accessory proteins as well as alternate mRNA splicing and post- 
translational modification (Gutman, Chandy et al., 2005). This would enable at least an order 
of magnitude more variants. 


Ion channel diversity increases the diversity of possible neuron electrophysiology, but not 
necessarily in a linear manner. See the discussion of inferring electrophysiology from gene 
transcripts in the interpretation chapter. 


Volume transmission 


Surrounding the cells of the brain is the extracellular space, on average 200 Å across and
corresponding to 20% of brain volume (Nicholson, 2001). It transports nutrients and buffers
ions, but may also enable volume transmission of signalling molecules. 


Volume transmission of small molecules appears fairly well established. Nitric oxide is
hydrophobic and has low molecular weight and can hence diffuse relatively freely through
membranes: it can reach up to 0.1-0.2 mm away from a release point under physiological
conditions (Malinski, Taha et al., 1993; Schuman and Madison, 1994; Wood and Garthwaite,
1994). While mainly believed to be important for autoregulation of blood supply, it may also
have a role in memory (Ledo, Frade et al., 2004). This might explain how LTP (Long Term
Potentiation) can induce "crosstalk" that reduces LTP induction thresholds over a span of 10
µm and ten minutes (Harvey and Svoboda, 2007).


Signal substances such as dopamine exhibit volume transmission (Rice, 2000) and this may
affect the potentiation of nearby synapses during learning: simulations show that a
single synaptic release can be detected up to 20 µm away and with a 100 ms half-life (Cragg,
Nicholson et al., 2001). Larger molecules have their relative diffusion speed reduced by the 
limited geometry of the extracellular space, both in terms of its tortuosity and its anisotropy 
(Nicholson, 2001). As suggested by Robert Freitas, there may also exist active extracellular 
transport modes. Diffusion rates are also affected by local flow of the CSF and can differ from 
region to region (Fenstermacher and Kaye, 1988); if this is relevant then local diffusion and 
flow measurements may be needed to develop at least a general brain diffusion model. The 
geometric part of such data could be relatively easily gained from the high resolution 3D 
scans needed for other WBE subproblems. 


Rapid and broad volume transmission such as from nitric oxide can be simulated using a
relatively coarse spatiotemporal grid size, while local transmission requires a grid with a 
spatial scale close to the neural scale if diffusion is severely hindered. 


For constraining brain emulation it might be useful to analyse the expected diffusion and
detection distances of the ≈200 known chemical signalling molecules based on their molecular
weight, diffusion constant and uptake (for different local neural geometries and source/sink 
distributions). This would provide information on diffusion times that constrain the diffusion 
part of the emulation and possibly show which chemical species need to be spatially 
modelled. 
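
A first-pass version of such an analysis could use standard diffusion estimates: an effective diffusion constant D* = D/λ² reduced by the tortuosity λ of the extracellular space, a characteristic reach sqrt(D*/k) for a substance removed by first-order uptake at rate k, and a diffusion time r²/(6D*) for distance r in three dimensions. The sketch below is a minimal illustration under these assumptions; the numerical values (free diffusion constant, tortuosity, uptake rate) are illustrative placeholders, not measurements from the text.

```python
import math


def effective_diffusion(d_free_um2_per_s: float, tortuosity: float) -> float:
    """Diffusion constant reduced by the tortuosity of the extracellular space."""
    return d_free_um2_per_s / tortuosity ** 2


def diffusion_length_um(d_eff_um2_per_s: float, uptake_rate_per_s: float) -> float:
    """Characteristic reach of a substance removed by first-order uptake."""
    return math.sqrt(d_eff_um2_per_s / uptake_rate_per_s)


def diffusion_time_s(distance_um: float, d_eff_um2_per_s: float) -> float:
    """Typical time to diffuse a given distance in three dimensions."""
    return distance_um ** 2 / (6.0 * d_eff_um2_per_s)


# Illustrative values for a small signalling molecule (assumptions):
D_FREE = 700.0      # um^2/s, i.e. 7e-6 cm^2/s
TORTUOSITY = 1.6    # dimensionless
UPTAKE_RATE = 10.0  # 1/s

d_eff = effective_diffusion(D_FREE, TORTUOSITY)
reach_um = diffusion_length_um(d_eff, UPTAKE_RATE)
time_to_20um_s = diffusion_time_s(20.0, d_eff)
```

Species whose reach is comparable to or smaller than the spacing between neurons would need a fine spatial grid, whereas species with a reach of hundreds of micrometres could be handled on a coarse grid.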


Body chemical environment 


The body acts as an input/output unit that interacts with our perception and motor activity. It 
also acts as a chemical environment that affects the brain through nutrients, hormones, 
salinity, dissolved gases, and possibly immune signals. Most of these chemical signals occur
on a subconscious level and only become apparent when they influence e.g. the hypothalamus to
produce hunger or thirst sensations. For brain emulation, some aspects of this chemical
environment have to be simulated.


This would require mapping the human metabolome, at least with regard to substances that
cross the blood-brain barrier. The metabolome is likely on the order of 2,000-2,500 
compounds (Beecher, 2003; Wishart, Tzur et al., 2007) and largely does not change more 
rapidly than on the second-timescale. This suggests that compared to the demands of the 
WBE, the body chemistry model, while involved, would be relatively simple. 


If a protein interaction model is needed rather than metabolism, then complexity increases. 
According to one estimate the human interactome comprises around 650,000 protein-protein
interactions (Stumpf, Thorne et al., 2008). 


Neurogenesis and remodelling 


Recent results show that neurogenesis persists in some brain regions in adulthood, and might 
have nontrivial functional consequences (Saxe, Malleret et al., 2007). During neurite 
outgrowth, and possibly afterwards, cell adhesion proteins can affect gene expression and
possibly neuron function by affecting second messenger systems and calcium levels (Crossin
and Krushel, 2000). However, neurogenesis is mainly confined to discrete regions of the brain 
and does not occur to a great extent in adult neocortex (Bhardwaj, Curtis et al., 2006). 


Since neurogenesis occurs on fairly slow timescales (> 1 week) compared to brain activity and 
normal plasticity, it could probably be ignored in brain emulation if the goal is an emulation 
that is intended to function faithfully for only a few days and not to exhibit truly long-term 
memory consolidation or adaptation. 


A related issue is remodelling of dendrites and synapses. Over the span of months dendrites 
can grow, retract and add new branch tips in a cell type-specific manner (Lee, Huang et al., 
2006). Similarly synaptic spines in the adult brain can change within hours to days, although 
the majority remain stable over multi-month timespans (Grutzendler, Kasthuri et al., 2002; 
Holtmaat, Trachtenberg et al., 2005; Zuo, Lin et al., 2005). Even if neurogenesis is ignored and 
the emulation is of an adult brain, it is likely that such remodelling is important to learning 
and adaptation.


Simulating stem cell proliferation would require data structures representing different cells
and their differentiation status, data on what triggers neurogenesis, and models allowing for 
the gradual integration of the cells into the network. Such a simulation would involve 
modelling the geometry and mechanics of cells, possibly even tissue differentiation. Dendritic 
and synaptic remodelling would also require a geometry and mechanics model. While
technically involved and requiring at least a geometry model for each dendritic compartment,
the computational demands appear small compared to neural activity.


Glia cells 


Glia cells have traditionally been regarded as merely supporting actors to the neurons, but 
recent results suggest that they may play a fairly active role in neural activity. Besides the
important role of myelinization for increasing neural transmission speed, at the very least 
they have strong effects on the local chemical environment of the extracellular space 
surrounding neurons and synapses. 


Glial cells exhibit calcium waves that spread along glial networks and affect nearby neurons 
(Newman and Zahs, 1998). They can both excite and inhibit nearby neurons through 
neurotransmitters (Kozlov, Angulo et al., 2006). Conversely, the calcium concentration of glial 
cells is affected by the presence of specific neuromodulators (Perea and Araque, 2005). This
suggests that the glial cells act as an information processing network integrated with the
neurons (Fellin and Carmignoto, 2004). One role could be in regulating local energy and
oxygen supply.


If glial processing turns out to be significant and fine-grained, brain emulation would have to 
emulate the glia cells in the same way as neurons, increasing the storage demands by at least 
one order of magnitude. However, the time constants for glial calcium dynamics are generally
far slower than the dynamics of action potentials (on the order of seconds or more), 
suggesting that the time resolution would not have to be as fine, making the computational 
demands increase far less steeply. 


Ephaptic effects 


Electrical effects may also play a role via so-called "ephaptic transmission". In a high
resistance environment, currents from action potentials are forced to flow through 
neighbouring neurons, changing their excitability. It has been claimed that this process 
constitutes a form of communication in the brain, in particular the hippocampus (Krnjevic, 
1986). However, in most parts of the brain there is a large extracellular space and blocking 
myelin, so even if ephaptic interactions play a role, they do so only locally, e.g. in the 
olfactory system (Bokil, Laaris et al., 2001), dense demyelinated nerve bundles (Reutskiy, 
Rossoni et al., 2003), or trigeminal pain syndromes (Love and Coakham, 2001). It should be 
noted that the nervous system appears relatively insensitive to everyday external electric 
fields (Valberg, Kavet et al., 1997; Swanson and Kheifets, 2006). 


If ephaptic effects were important, the emulation would need to take the locally induced
electromagnetic fields into account. This would plausibly involve dividing the extracellular
space (possibly also the intracellular space) into finite elements where the field can be
assumed to be constant, linear or otherwise easily approximable. The cortical extracellular
length constant is on the order of 100 µm (Gardner-Medwin, 1983), which would necessitate on
the order of 1.4·10¹² such compartments if each compartment is 1/10 of the length constant.
Each compartment would need at least two vector state variables and 6 components of a
conductivity tensor; assuming one byte for each, the total memory requirements would be on
the order of 10 terabytes. Compared to estimates of neural simulation complexity, this is
relatively manageable. The processing needed to update these compartments would be on the
same order as a detailed compartment model of every neuron and glia cell.
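
The compartment count and memory estimate can be reproduced with a few lines of arithmetic. The sketch below assumes a length constant of about 100 µm, cubic compartments, a brain volume of about 1400 cm³ and three components per vector state variable; the brain volume and the component count are assumptions not stated in the text.

```python
BRAIN_VOLUME_CM3 = 1400.0    # assumed typical adult brain volume
LENGTH_CONSTANT_UM = 100.0   # cortical extracellular length constant
UM_PER_CM = 1.0e4

# Each compartment is a cube with an edge of 1/10 of the length constant.
edge_cm = (LENGTH_CONSTANT_UM / 10.0) / UM_PER_CM
n_compartments = BRAIN_VOLUME_CM3 / edge_cm ** 3

# Two 3-component vectors plus 6 conductivity tensor components, 1 byte each.
bytes_per_compartment = 2 * 3 + 6
total_terabytes = n_compartments * bytes_per_compartment / 1.0e12

print(f"{n_compartments:.2e} compartments, {total_terabytes:.1f} TB")
```

Under these assumptions the total comes to somewhat under 20 terabytes, consistent with an order-of-magnitude figure of 10 terabytes.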


Dynamical state 


The methods for creating the necessary data for brain emulation discussed in this paper deal 
with just the physical structure of the brain tissue, not its state of activity. Some information 
such as working memory may be stored just as ongoing patterns of neural excitation and 
would be lost. Similarly, information in calcium concentrations, synaptic vesicle depletion, 
and diffusing neuromodulators may be lost during scanning. A likely consequence would be 
amnesia of the time closest to the scanning. 


However, loss of brain activity does not seem to prevent the return of function and personal 
identity. This is demonstrated by the reawakening of coma patients, and by cold water near- 
drowning cases in which brain activity temporarily ceased due to hypothermia (Elixson, 
1991). 


Quantum computation 


While practically all neuroscientists subscribe to the dogma that neural activity is a 
phenomenon that occurs on a classical scale, there have been proposals (mainly from 
physicists) that quantum effects play an important role in the function of the brain (Penrose, 
1989; Hameroff, 1987). So far there is no evidence for quantum effects in the brain beyond 
quantum chemistry, and no evidence that such effects play an important role for intelligence 
or consciousness (Litt, Eliasmith et al., 2006). There is no lack of possible computational 
primitives in neurobiology nor any phenomena that appear unexplainable in terms of 
classical computations (Koch and Hepp, 2006). Quantitative estimates for decoherence times
for ions during action potentials and microtubules suggest that they decohere on a timescale
of 10⁻¹³ to 10⁻²⁰ s, about ten orders of magnitude faster than the normal neural activity
timescales. Hence quantum effects are unlikely to persist long enough to affect processing
(Tegmark, 2000). This, however, has not deterred supporters of quantum consciousness, who 
argue that there may be mechanisms protecting quantum superpositions over significant 
periods (Rosa and Faber, 2004; Hagan, Hameroff et al., 2002). 


If these quantum-mind hypotheses were true, brain emulation would be significantly more 
complex, but not impossible given the right (quantum) computer. In (Hameroff, 1987) mind 
emulation is considered based on quantum cellular automata, which in turn are based on the 
microtubule network that the author suggests underlies consciousness. 


Assuming 7.1 microtubules per square µm and 768.9 µm in average length (Cash, Aliev et al.,
2003) and that 1/30 of brain volume is neurons (although given that microtubule networks
occur in all cells, glia - and any other cell type! - may count too) gives 10¹⁶ microtubules. If
each stores just a single quantum bit this would correspond to a 10¹⁶ qubit system, requiring a
physically intractable 2^(10¹⁶) bit classical computer to emulate. If only the microtubules inside a
cell act as a quantum computing network, the emulation would have to include 10¹¹
connected 130,000 qubit quantum computers. Another calculation, assuming merely classical
computation in microtubules, suggests 10¹⁹ bytes per brain operating at 10²⁸ FLOPS
(Tuszynski, 2006). One problem with these calculations is that they impute such a profoundly
large computational capacity at a subneural level that a macroscopic brain seems unnecessary
(especially since neurons are metabolically costly).


Analog computation 


A surprisingly common doubt expressed about the possibility of simulating even simple 
neural systems is that they are analog rather than digital. The doubt is based on the 
assumption that there is an important qualitative difference between continuous and discrete 
variables. 


If computations in the brain make use of the full power of continuous variables the brain may 
essentially be able to achieve “hypercomputation”, enabling it to calculate things an ordinary 
Turing machine cannot (Ord, 2006; Siegelmann and Sontag, 1995). See (Zenil and Hernandez- 
Quiroz, 2007) for a review of how different brain computational architectures would enable 
different levels of computational power, and the requirements for neural networks to 
simulate these. 


However, brains are made of imperfect structures which are, in turn, made of discrete atoms 
obeying quantum mechanical rules forcing them into discrete energy states, possibly also 
limited by a space-time that is discrete on the Planck scale (as well as noise, see below) and so 
it is unlikely that the high precision required of hypercomputation can be physically realized 
(Eliasmith, 2001). Even if hypercomputation were physically possible, it would by no means 
be certain that it is used in the brain, and it might even be difficult to detect if it were (the 
continual and otherwise hard to explain failure of WBE would be some evidence in this 
direction). However, finding clear examples of non-Turing computable abilities of the mind 
would be a way of ruling out Turing emulation. 


A discrete approximation of an analog system can be made arbitrarily exact by refining the
resolution. If an M bit value is used to represent a continuous signal, the signal-to-noise ratio
is approximately 20 log₁₀(2^M) dB (assuming uniform distribution of discretization errors,
which is likely for large M). The discretization error can relatively easily be made smaller than
the natural noise sources such as unreliable synapses, thermal, or electrical noise. The thermal
noise is on the order of 4.2·10⁻²¹ J, which suggests that energy differences smaller than this can
be ignored unless they occur in isolated subsystems or on timescales fast enough to not
thermalize. Field potential recordings commonly have fluctuations on the order of millivolts
due to neuron firing and a background noise on the order of tens of microvolts. Again this
suggests a limit to the necessary precision of simulation variables⁴.
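
The numbers in this argument are easy to check. The sketch below evaluates the quantization signal-to-noise ratio 20 log₁₀(2^M), which grows by about 6 dB per bit, the thermal energy kT at an assumed body temperature of 310 K, and the number of bits needed to resolve a millivolt-scale signal down to a noise floor of 10 µV. The function names and the specific example amplitudes are illustrative assumptions.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def quantization_snr_db(bits: int) -> float:
    """Signal-to-noise ratio of an M-bit representation, 20*log10(2**M)."""
    return 20.0 * math.log10(2.0 ** bits)


def bits_for_ratio(signal_amplitude: float, noise_amplitude: float) -> int:
    """Smallest number of bits whose step size is below the noise amplitude."""
    return math.ceil(math.log2(signal_amplitude / noise_amplitude))


thermal_energy_j = BOLTZMANN_J_PER_K * 310.0   # kT at an assumed 310 K
snr_16_bits_db = quantization_snr_db(16)
bits_field_potential = bits_for_ratio(1e-3, 1e-5)  # 1 mV signal, 10 uV noise
```

On these assumptions a field potential needs only a handful of bits before the quantization error drops below the background noise, so ordinary 16 or 32 bit variables leave a wide margin.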


Determinism 


A somewhat related criticism is the assumed determinism of computers, while the brain is 
assumed either to contain true randomness or a physically indeterministic element (often 
declared to be “free will”).


The randomness version of the determinism criticism can be met by including sufficient noise
in the simulation. Random events may play a role in the function of the nervous system. In
particular, the number of individual molecules involved in transcription regulation of some
proteins and in the function of synaptic spines is low enough that individual random
interactions can affect whether a gene is switched on or whether a phosphorylation cascade
happens. This randomness may have biological importance, and is sometimes included in
biophysical models. Another possible role of noise might be stochastic resonance, where noise
acts to increase the signal-to-noise ratio in nonlinear (and information-theoretically
suboptimal) systems attempting to detect a weak periodic input (Gammaitoni, Hanggi et al.,
1998). It has been demonstrated in mechanoreceptors (Douglass and Wilkens, 1998) and
various other forms of sensory systems (Douglass and Wilkens, 1998) as a way of detecting
faint signals. However, while the exact spectral distribution and power may matter for an
emulation, it does not seem that the resonance effect would be disrupted by using
pseudorandom noise if the recurrence time is long enough. Unless there are some important
"hidden variables" in the noise of the brain, noise can be easily approximated using a suitably
long-periodic random number generator (Tegmark, 2000) or even by means of an attached
physical random number generator using quantum mechanics (Stefanov, Gisin et al., 2000).
Randomness is therefore highly unlikely to pose a major obstacle to WBE.


4 Analog computation may still be a useful hardware paradigm under some conditions.
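
The claim that pseudorandom noise suffices can be illustrated with a toy threshold detector. This is only a sketch under stated assumptions, not a biophysical model: the sine amplitude, threshold, noise level and function name are all hypothetical choices. The weak periodic input never reaches the threshold on its own, but with added pseudorandom noise (Python's Mersenne Twister generator, whose period is 2^19937 - 1) the detector fires, and it fires almost only during the positive half-cycle of the input.

```python
import math
import random


def count_detections(noise_sigma, n_steps=20000, period=100,
                     amplitude=0.8, threshold=1.0, seed=1):
    """Count threshold crossings of a subthreshold sine wave plus noise.

    Returns (total crossings, crossings during the positive half-cycle).
    """
    rng = random.Random(seed)  # Mersenne Twister, period 2**19937 - 1
    total = 0
    in_phase = 0
    for t in range(n_steps):
        signal = amplitude * math.sin(2.0 * math.pi * t / period)
        sample = signal + rng.gauss(0.0, noise_sigma)
        if sample > threshold:
            total += 1
            if signal > 0.0:
                in_phase += 1
    return total, in_phase


for sigma in (0.0, 0.3):
    total, in_phase = count_detections(sigma)
    print(f"sigma={sigma}: {total} crossings, {in_phase} in phase with input")
```

Without noise the detector is silent; with noise its output carries the phase of the input, which is the qualitative effect the text refers to.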


Hidden variables or indeterministic free will appear to have the same status as quantum 
consciousness: while not in any obvious way directly ruled out by current observations, there 
is no evidence that they occur or are necessary to explain observed phenomena. 


Summary 


Table 4 shows an overview of informal estimates of the likelihood that certain features are
needed for WBE and whether they would pose serious implementation problems if needed.


Table 4: Likelihood estimates of modelling complications.

Feature                          Likelihood needed for WBE   Implementation problems
-------------------------------  --------------------------  ------------------------------------------
Spinal cord                      Likely                      Minor. Would require scanning some extra
                                                             neural tissue.
Synaptic adaptation              Very likely                 Minor. Introduces extra state-variables
                                                             and parameters that need to be set.
Currently unknown                Very likely                 Minor. Similar to known transmitters and
neurotransmitters and                                        modulators.
neuromodulators
Currently unknown ion channels   Very likely                 Minor. Similar to known ion channels.
Volume transmission              Somewhat likely             Medium. Requires diffusion models and
                                                             microscale geometry.
Body chemical environment        Somewhat likely             Medium. Requires metabolomic models and
                                                             data.
Neurogenesis and remodelling     Somewhat likely             Medium. Requires cell mechanics and growth
                                                             models.
Glia cells                       Possible                    Minor. Would require more simulation
                                                             compartments, but likely running on a
                                                             slower timescale.
Ephaptic effects                 Possible                    Medium. Would require relatively
                                                             fine-grained EM simulation.
Dynamical state                  Very unlikely               Profound. Would preclude most proposed
                                                             scanning methods.
Quantum computation              Very unlikely               Profound. Would preclude currently
                                                             conceived scanning methods and would
                                                             require quantum computing.
Analog computation               Very unlikely               Profound. Would require analog computer
                                                             hardware.
True randomness                  Very unlikely               Medium to profound, depending on whether
                                                             merely "true" random noise or "hidden
                                                             variables" are needed.


Scanning


The first step in brain emulation is to acquire the necessary information from a physical brain. 
We will call this step "scanning". 


Brain emulation for compartment models of neuron activity needs to acquire both geometric 
data about the localization and morphology of the nervous connections and 
functional/chemical data about their nature such as what ion channels, receptors, and 
neurotransmitters are present, the presence of electrical synapses, electrical membrane 
properties, phosphorylation states of synapses, and genetic expression states. This must be 
done at a sufficient resolution. It may be possible to infer some functional properties such as 
whether a synapse is excitatory or inhibitory purely from geometry (e.g., a synapse from a 
smooth neuron with a particular morphology is likely inhibitory). Yet it remains unclear how 
much information about synaptic strength and neuromodulator type can be inferred from 
pure geometry at a given level of resolution. 


There are several potential approaches to scanning. Here we focus on destructive scanning, in 
which the brain is destructively disassembled during the emulation process. The process 
could be applied immediately after death or on cryogenically preserved brain tissue. This is 
the technologically simplest approach. Destructive scanning allows greater freedom in the
physical effects used, the energy levels, and the way the brain is fixed (by freezing and/or
chemical fixation). Possible alternatives include non-destructive scanning and gradual replacement;
these are discussed in Appendix E. 


Most methods discussed in this section require the sectioning of the brain into smaller pieces 
that can be accurately scanned. The sectioning methods and the handling of the tissue may 
pose significant problems and require the development of special handling technology. In 
particular, sample distortion and drift are key problems; the first requiring very careful 
sectioning methods, the second either high-precision equipment or embedding of fiducial 
markers. 


MRI microscopy 


Although MRI may not be suitable for scanning entire brains at sufficient resolution,
MRI microscopy might be suitable for scanning parts of them if water diffusion is stopped by 
freezing or fixation. 


3D MR microscopy has achieved resolutions on the order of 3.7 × 3.3 × 3.3 µm (Ciobanu, Seeber
et al., 2002) but is limited by the small field of view and long exposure times. One way of 
improving the field of view may be parallelism, where an array of MRI coils is moved over 
the surface (McDougall and Wright, 2007). 


A combination of MRI and atomic force microscopy (AFM) is magnetic resonance force
microscopy (MRFM), where a
magnetic bead is placed on an ultrathin cantilever and moved over the sample (or the
reverse). By generating RF magnetic fields the spins of nuclei within the resonant slice
beneath the bead can be flipped, producing forces on the cantilever that can be detected. This
would enable identification of the molecules present near the surface. Current resolutions
achieved are 80 nm voxels in a scan volume of 0.5 µm³ (Chao, Dougherty et al., 2004) and
single spin detection with 25 nm resolution in one dimension (Rugar, Budakian et al., 2004).
Whether this can be scaled up to detecting e.g. the presence of cell membranes or particular
neurotransmitters remains to be seen.


Optical methods


Figure 12: Confocal microscopy picture of visual cortex neurons. Stained with GABA-channel
antibody (red) and containing neurons expressing green fluorescent protein. The
white scale bar is 100 µm. (Lee, Huang et al., 2006).


Optical microscopy methods are limited by the need for staining tissues to make relevant 
details stand out and the diffraction limits set by the wavelength of light (≈0.2 µm). The main
benefit is that they go well together with various spectrometric methods (see below) for 
determining the composition of tissues. 
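
As a rough check on the scale of the diffraction limit mentioned above, the Abbe formula d = wavelength / (2 NA) can be evaluated directly. This is a minimal sketch: the 500 nm wavelength and the numerical aperture of 1.25 (a typical oil-immersion objective) are illustrative assumptions, not values from the text, and the function name is hypothetical.

```python
def abbe_limit_um(wavelength_nm: float, numerical_aperture: float) -> float:
    """Abbe diffraction limit d = wavelength / (2 * NA), returned in micrometres."""
    return wavelength_nm / (2.0 * numerical_aperture) / 1000.0


# Assumed values: green light and a high-NA oil-immersion objective.
print(f"{abbe_limit_um(500.0, 1.25):.2f} um")
```

Under these assumptions the limit comes out at about 0.2 µm, far coarser than the tens of nanometres needed to resolve fine neural processes.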


Sub-diffraction optical microscopy is possible, if limited. Various fluorescence-based methods 
have been developed that could be applicable if fluorophores could be attached to the brain 
tissue in a way that provided the relevant information. Structured illumination techniques 
use patterned illumination and post-collection analysis of the interference fringes between the 
illumination and sample image together with optical nonlinearities to break the diffraction 
limit. This way, 50 nm resolving power can be achieved in a wide field, at the price of 
photodamage due to the high power levels (Gustafsson, 2005). Near-field scanning optical 
microscopy (NSOM) uses a multinanometre optic fiber to scan the substrate using near-field 
optics, gaining resolution (down to the multi-nanometer scale) and freedom from using
fluorescent markers at the expense of speed and depth of field. It can also be extended into
near field spectroscopy. 


Confocal microscopy suffers from having to scan through the entire region of interest, and
image quality degrades away from the focal plane. Using inverse scattering methods,
depth-independent focus can be achieved (Ralston, Marks et al., 2007).


Optical histology and dissection 

All-optical histology uses femtosecond laser pulses to ablate tissue samples, avoiding the 
need for mechanical removal of the surface layer (Tsai, Friedman et al., 2003). This treatment 
appears to change the tissue 2-10 µm from the surface. However, Tsai et al. were optimistic
about being able to scan an entire fluorescence-labelled mouse brain into 2 terapixels at the
diffraction limit of spatial resolution.


Another interesting application of femtosecond laser pulses is microdissection (Sakakura, 
Kajiyama et al., 2007; Colombelli, Grill et al., 2004). The laser was able to remove 100 µm
samples from plant and animal material, modifying a ~10 µm border. This form of optical
dissection might be an important complement for EM methods, in that, after scanning the
geometry of the tissue at a high resolution, relevant pieces can be removed and analyzed
microchemically. This could enable gaining both the EM connectivity data and detailed
biochemistry information. Platforms already exist that can inject biomolecules into
individual cells, perform microdissection, isolate and collect individual cells using laser
catapulting, and set up complex optical force patterns (Stuhrmann, Jahnke et al., 2006).


Knife-Edge Scanning Microscopy (KESM) 






Figure 13: Knife-Edge Scanning Microscope. Microscope objective (left), diamond knife
(right) cutting the specimen in the center. Note the illumination refracted through the
knife (Copyright Brain Networks Laboratory, Texas A&M University).


KESM is a method for staining and imaging large volumes at high resolutions by integrating
the sectioning and imaging step. It was developed for reconstruction and modelling of the 
three-dimensional anatomical structure of individual cells in situ. While having lower 
resolution than SBFSEM (see below) it enables the imaging of an entire macroscopic tissue
volume such as a mouse brain in reasonable time (McCormick, 2002a; McCormick, 2002b). 


The KESM acts simultaneously as a microtome and a microscope. A diamond knife is used
for the cutting and illumination, moving together with the microscope across a stationary
tissue specimen embedded in plastic. The just-separated section is imaged along a 1D
strip. Imaging at 0.3 µm resolution of 0.5 µm thick sections has been demonstrated,
producing datasets where microvasculature and individual cells can be traced using
contouring algorithms. The instrument scans at rates up to 200 Mpixels/s. This would enable
scanning a mouse brain (~1 cm³) in a hundred hours, giving a 15 terabyte dataset of
uncompressed data (Mayerich, Abbott et al., 2008). The width of field is 2.5 mm for 64 µm
resolution and 0.625 mm for 32 µm resolution (McCormick, Koh et al., 2004). To
reduce jitter and to allow knives larger than the microscope field of view, the sample is cut in
a stair-step manner (Koh and McCormick, 2003).


The main limitation is the need for staining inside a volume. At present the number of such 
stains is limited. Using transgenic animals expressing stains or fluorescence could enable 
mapping of different neural systems. 


X-Ray microscopy


Figure 14: X-ray microtomography stereo image of the supraesophageal ganglion in
Drosophila, showing individual neurons. Scale 50 µm. (Mizutani, Takeuchi et al., 2007)


X-ray microscopy is intermediate between optical microscopy and electron microscopy in 
terms of resolution. 


Hard X-ray microtomography has been used to image the larval Drosophila brain (Mizutani, 
Takeuchi et al., 2007). This was achieved by staining neurons with metal and then acquiring 
images rotated by 0.12° with 300 ms exposure, with a total acquisition time of 1800 seconds. 
The field of view was on the order of 0.5 mm² with a pixel resolution of 0.47 × 0.47 µm; the
resulting tomographical resolution was about 1 µm in each direction. Individual neurons
could be identified, and the authors estimate that the setup could in principle scan a 1 mm³
block of mammalian nerve tissue with $10 \times 10^{4}$ neurons. However, the resolution is likely too
crude to identify synapses and connectivity. 


By using soft X-rays with wavelengths in the "water window" (between the absorption edges
of oxygen and carbon at 2.3-4.4 nm) where carbon absorbs ten times as strongly as water,
organic structures can be imaged with good contrast down to at least 30 nm resolution.
The method has several advantages: it requires no special sample preparation procedures and
can be applied to hydrated cells; it permits thicker specimens than EM (up to 10 µm); X-ray
absorption spectra of localized regions can be recorded by using different X-ray energies; and
three-dimensional imaging could potentially be possible using multiple beams (Yamamoto
and Shinohara, 2002).


A major disadvantage is that the X-rays cause tissue damage. When the total X-ray flux goes
above $4 \times 10^5$ photons per µm², myofibrils lose their contractility; and above $10^6$ photons per
µm², yeast cells lose their dye exclusion ability (a sign that they are dead or punctured). This
is lower than the fluxes of X-ray microscopes with 50 nm resolution, making scanning of
tissue while retaining its function impossible (Fujisaki, Takahashi et al., 1996). One way of
avoiding degradation of the image is to use a rapid pulse of rays. This also avoids motion
blurring due to diffusion in the case of hydrated tissue: to image freely diffusing objects of
size 10 nm an exposure shorter than 0.1 ms is needed, while a cellular organelle would require
between 14 ms and 1.4 s depending on size (Ito and Shinohara, 1992). Using vitrified tissue the
radiation dose can be increased thousandfold without structural changes, and the diffusion 
issue disappears (Methe, Spring et al., 1997). These concerns make X-ray imaging unlikely to 
work for non-destructive scanning. 


Of particular interest for WBE is the possibility of doing spectromicroscopy to determine the
local chemical environment. X-ray absorption edges are affected by the chemical binding state 
of the atom probed. Different amino acids have distinguishable 'fingerprints' that are 
relatively unaffected by peptide bonding. In principle, the spectra of proteins could be 
predicted based on their sequences. Scanning X-ray microscopes have long exposure times 
(minutes), although they deposit 5-10 times less radiation in the sample (Jacobsen, 1999). 
While currently far too slow for WBE purposes, finding ways of speeding up this process 
may make it relevant for WBE. 


Atomic-beam microscopy 


Instead of photons or electrons, neutral atoms could be used to image a sample. Since the de 
Broglie wavelength of thermal atoms is subnanometer, the resolution could be very high. By 
using uncharged and inert atoms like helium, the beam would also be non-destructive (Holst 
and Allison, 1997). Helium atom scattering has a large cross-section with hydrogen, which 
might make it possible to detect membranes in unstained tissue. 
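
The statement that thermal atoms have a subnanometer de Broglie wavelength can be checked with a short calculation. The sketch below assumes helium-4 at room temperature (300 K) moving at its root-mean-square thermal speed; the temperature, the choice of rms speed and the function name are illustrative assumptions rather than values from the text.

```python
import math

PLANCK_H = 6.62607015e-34      # Planck constant, J s
BOLTZMANN_K = 1.380649e-23     # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def thermal_de_broglie_nm(mass_amu: float, temperature_k: float) -> float:
    """De Broglie wavelength (nm) of an atom at its rms thermal speed."""
    mass_kg = mass_amu * ATOMIC_MASS_UNIT
    # Momentum from (1/2) m v^2 = (3/2) k T, i.e. p = sqrt(3 m k T).
    momentum = math.sqrt(3.0 * mass_kg * BOLTZMANN_K * temperature_k)
    return PLANCK_H / momentum * 1e9


# Helium-4 (about 4.0026 u) at an assumed 300 K.
print(f"{thermal_de_broglie_nm(4.002602, 300.0):.3f} nm")
```

Under these assumptions the wavelength is well below 0.1 nm, so diffraction would not be the factor limiting resolution.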


At present, imaging at high resolution using atomic beam microscopy has not been achieved
(lower resolution has been achieved (Doak, Grisenti et al., 1999)). Development of ridged
atomic mirrors has enabled focusing neutral atom beams (Oberst, Kouznetsov et al., 2005;
Shimizu and Fujita, 2002) and would in principle allow focusing the beam to a spot size of
tens of nanometers (Kouznetsov, Oberst et al., 2006); by scanning the spot across the sample
an image could be built up as in other forms of scanning microscopy.


Electron microscopy








Figure 15: EM picture of rat hippocampus neuropil. D marks a dendrite of a pyramidal cell.
Several synapses can be seen to the left, noticeable by the presence of small spherical
vesicles on the presynaptic side and a dark postsynaptic density on the receiving side. The
scale bar is 1 µm. (Copyright J Spacek, Synapse Web)





Electron microscopy can resolve the fine details of axons and dendrites in dense neural tissue. 
Images can be created through transmission electron microscopy (TEM), where electrons are 
sent through tissue, or scanning electron microscopy (SEM), where electrons are scattered
from the surface. Both methods require fixing the sample by freezing and/or embedding it in
polymer. TEM has achieved 0.1 nm imaging (Nellist, Chisholm et al., 2004). However, the
main challenge is to automate sectioning and acquisition of data. The three current main 
methods are serial section electron tomography (SSET), serial section transmission electron 
microscopy (SSTEM) and serial block-face scanning electron microscopy (SBFSEM) 
(Briggman and Denk, 2006). Two new methods that may be useful are FEI’s DualBeam 
Focused Ion Beam SEM (FIBSEM) and automatic ultrathin sectioning and SEM imaging 
(ATLUM). 


SSET: High resolution TEM 3D images can be created using tilt-series-based tomography 
where the preparation is tilted relative to the electron beam, enabling the recording of depth 
information (Frank, 1992; Penczek, Marko et al., 1995). This method appears suited mainly for 
local scanning (such as imaging cellular organelles) and cannot penetrate very deep into the
surface (around 1 µm) (Lučić, Forster et al., 2005).


SSTEM: Creating ultrathin slices for TEM is another possibility. Tsang (2005) created a
three-dimensional model of the neuromuscular junction through serial TEM of 50 nm sections
created using an ultramicrotome. White, Southgate et al. (1986) used serial sections to
reconstruct the C. elegans nervous system. However, sectioning is physically tricky and labor
intensive.


SBFSEM: One way of reducing the problems of sectioning is to place the microtome inside
the microscope chamber (Leighton, 1981); for further contrast, plasma etching was used
(Kuzirian and Leighton, 1983). Denk and Horstmann (2004) demonstrated that
backscattering contrast could be used instead in a SEM, simplifying the technique. They
produced stacks of 50-70 nm thick sections using an automated microtome in the microscope
chamber, with lateral jitter less than 10 nm. The resolution and field size were limited by the
commercially available system. They estimated that tracing of axons with 20 nm resolution
and S/N ratio of about 10 within a 200 µm cube could take about a day (while 10 nm × 10 nm
× 50 nm voxels at S/N 100 would require a scan time on the order of a year).


Reconstructing volumes from ultrathin sections faces many practical challenges. Current
electron microscopes cannot handle sections wider than 1-2 mm. Long series of sections are
needed but the risk of errors or damage increases with the length, and the number of specimen
holding grids becomes excessive (unless sectioning occurs inside the microscope (Kuzirian
and Leighton, 1983)). Current state of the art for practical reconstruction from tissue blocks is
about 0.1 mm³, containing about $10^7$-$10^8$ synapses (Fiala, 2002).


FIBSEM (see footnote 5): The semiconductor industry has long used focused ion beams (FIB)
(usually accelerated beams of gallium ions) to very precisely cut away parts of integrated
circuit chips for failure analysis. Researchers at FEI have recently demonstrated that this technique and
instrument can also be applied to plastic embedded neural tissue in a manner similar to the
SBFSEM above (Mulders, Knott et al., 2006). In the FIBSEM, the top 30-50 nm layer of a block
of tissue (having block face dimensions of approximately 100 × 100 µm) is ablated away using
the focused ion beam. The resulting block face is then imaged with the SEM using the
backscatter signal just as is done in the SBFSEM, and this process is repeated to image
hundreds of successive layers.


Because the FIBSEM ablates away the block's top layer using an ion beam (instead of a 
diamond knife), the 'cooking' of the block surface that occurs in SBFSEM due to the high 
beam current is not a problem. Using longer integration times, FIBSEM images have
demonstrated lateral resolutions of 5 nm or better and much improved signal-to-noise ratios.


ATLUM: The above SSET and SSTEM approaches use sections obtained by diamond knife 
sectioning on a traditional semi-automated ultramicrotome. This process is only semi- 
automated because the collection of these sections requires manually scooping free-floating 
sections from the knife's water boat onto TEM slot grids using an eyelash. This is an 
inherently unreliable process for all but the smallest volumes. 


The Automatic Tape-Collecting Lathe Ultramicrotome (ATLUM) is a prototype machine
recently developed at Harvard University that fully automates the process of ultrathin
sectioning and collection (Hayworth, Kashturi et al., 2006; Hayworth, 2007). In the ATLUM a
tissue sample (typically 1-2 mm in width, 10 mm long, and 0.5 mm deep) is mounted on a steel
axle and is rotated continuously via an ultraprecise air bearing while a piezo-driven diamond
knife slices off one ultrathin section per revolution. A feedback loop employing capacitive
sensors maintains the position of the knife relative to the axle to within approximately 10 nm,
allowing even large area tissue sections (many square millimeters) to be cut to thicknesses
below 40 nm. Each section comes streaming off the knife's edge onto the surface of a water
boat attached behind the knife. In traditional ultramicrotomy this fragile floating section
would have to be collected manually (via dexterous use of an eyelash) onto a TEM slot grid, a
notoriously unreliable process. In contrast, in the ATLUM each section is immediately
collected by the mechanism onto a submerged conveyor belt of carbon coated Mylar tape (100
µm thick and 8 mm wide). In this way, hundreds of large-area ultrathin sections are
automatically secured onto a meters-long Mylar tape.


5 Text derived from Kenneth Hayworth.


Figure 16: Overview of ATLUM operation. Label in figure: "This tissue ribbon is collected by a
submerged conveyor belt". (Copyright Lichtman Lab, Harvard University)





This tape is subsequently stained with heavy metals and imaged in a SEM. Imaging via the 
SEM backscatter signal (like the SBFSEM and FIBSEM approaches above) allows the ATLUM 
to dispense with fragile TEM slot grids (which also removes their width-of-section 
limitation). The Mylar tape provides a sturdy substrate for handling and storage while its 
carbon coating prevents charging and beam damage during SEM imaging. 


Images obtained from ATLUM tissue tapes are of equivalent quality to traditional TEM 
images showing lateral resolution better than 5 nanometers. Recent tests have also shown that 
a tomographic tilt series (as in the SSET technique) may be performed on ATLUM collected 
sections to obtain additional depth information. This is performed by tilting the section at 
various angles with respect to the SEM's electron beam thus providing Z-resolutions even 
finer than the section thickness itself. Because the sections are stored for later imaging they
can be post-stained with heavy metals (producing images with superior signal to noise ratios
and much faster imaging times than the SBFSEM and FIBSEM block face imagers) and, if
needed, they can be post-stained with a series of other stains for overlaying chemical analysis
maps on the high resolution SEM structural images.


Comparison of techniques


Resolution 

With the exception of the SBFSEM, all of these techniques have already demonstrated 
sufficient resolution to map the exact point-to-point connectivity of brain tissue down to the 
level of counting synaptic vesicles in individual synapses. With additional modifications it 
may be possible for the SBFSEM to also reach these resolutions. 


Reliability and robustness 

Transmission EM techniques like the SSTEM and SSET require sections to be mounted on slot 
grids. The process of collecting sections on slot grids has not yet been successfully automated, 
and sections mounted on slot grids are themselves extremely fragile (because they are 
essentially freely supported structures less than a micron thick). This raises serious doubts as 
to the feasibility of scaling these techniques up to very large volumes of neural tissue 
(although machines like the ATLUM may be modifiable to allow automated collection of 
sections on tapes with prefabricated slots, or alternatively slots can be created in the tape after 
collection is completed). SEM backscatter imaging seems to offer equivalent resolution and 
image quality while avoiding these pitfalls intrinsic to TEM approaches. 


One remaining possible advantage of TEM approaches is imaging speed. Images in current 
SEMs are built up one pixel at a time as the electron beam scans across the sample's surface. 
In contrast, the pixels in TEM images are captured in parallel by a CCD + scintillator mounted 
below the section. This fact could, in principle, allow imaging times orders of magnitude 
faster than SEM imaging. However, many technical issues complicate this comparison. In 
practice, current TEMs image sections only slightly faster than SEMs do. In addition, 
multibeam SEMs (described below) are currently being developed that will massively 
parallelize, and thus speed up, SEM image acquisition.


Block face approaches (SBFSEM and FIBSEM) have inherent reliability since they avoid the 
perilous step of collecting ultrathin sections by simply destroying them in situ after having 
already imaged them. The FIBSEM technique is additionally, at least in principle, robust to 
small differences in embedding quality. All of the other techniques use diamond knife 
sectioning which requires good uniformity of resin infiltration and resin hardness throughout 
the tissue block to section smoothly. Of course tissue samples for all the techniques must be 
infiltrated correctly for proper ultrastructural preservation.


While the ATLUM is necessarily less reliable than block face approaches (since it collects 
ultrathin sections for later imaging) it still has the potential for extreme reliability over large 
volumes. This is because the ultrathin sections, within the ATLUM mechanism, are secured to 
the sturdy Mylar tape almost immediately after they are sectioned by the diamond knife. In 
fact, the leading edge of each section is secured to the collection tape while the trailing edge is
still being sectioned by the knife. In this way each section is always under complete control of 
the mechanism. 


Imaging time 

For concreteness, let's outline a reasonable near-term neuronal circuit mapping goal 
necessary to support a set of "partial" brain circuit emulation experiments. We can then 
compare the imaging times, and thus the feasibility, of the different methods. One of the most 
interesting and best studied pieces of brain circuitry is that underlying orientation tuning in 
the primary visual cortex (V1) in the cat and in the primate. This circuit's crude functional 
properties (orientation tuned 'simple' and 'complex' cells) have been known for decades yet 
the circuitry underlying these functional responses remains a topic of heated debate,
generating dozens of journal articles every year. More importantly, our lack of a detailed
understanding of the neuronal circuitry has blocked progress into V1's more subtle 
computational properties such as its role in perceptual grouping. 


A trial brain emulation experiment might set as a goal the modelling of orientation tuning 
responses of a few cells in V1 (possibly correlated with previous two-photon recording of 
activity within the same tissue). This task would not require that entire volumes of complex
neuropil be traced in full. It would require, however, imaging the connectivity of at least
several dozen neurons in the various layers of V1, and their connectivity both with each other
and with many dozen axonal projections to V1 from the lateral geniculate nucleus (LGN). In
addition, these axonal projections from the LGN would need to be traced back to their origins
to determine their relative positions and thus to infer their own receptive field properties. In
all, such a study would require nanoresolution tracing of parts of over one hundred
neurons and thousands of synaptic connections spanning a volume on the order of 10 mm³.
(This assumes that the myelinated projections between LGN and V1 can be traced at much
lower resolution, excluding them from the volume estimate.) Because of the nature of the study,
such neurons could not be selectively labelled beforehand and so the electron microscopic 
methods described in this section (which can image any arbitrary neuronal cell or process) are 
the only current techniques which may be capable of performing the task. 


SSET and SSTEM: As discussed above, the manual collection of TEM-ready sections (for 
techniques like the SSET and SSTEM) is out of the question for volumes as large as 10 mm³.


SBFSEM and FIBSEM: So far the largest volumes sectioned in these devices have been less
than 0.01 mm³, with blockface dimensions typically 200 × 200 µm or less. Imaging rate on
these devices is around 100 kHz (10 microseconds per pixel). In general, blockface imaging
approaches necessitate such slow imaging rates (to obtain adequate signal to noise ratios) 
since the material can only be lightly stained while in block form. 


Our V1 circuit tracing scenario does not strictly require the entire 10 mm³ volume be imaged
at highest resolution; however, as the neuronal circuits to be traced wander randomly 
throughout the entire volume, one does not know a priori which parts to image at high 
resolution. Because, in these blockface imaging devices, each section is destroyed in situ, 
there is only one chance to image it and thus in effect all parts of the volume must be imaged 
at sufficient resolution to trace the finest neuronal processes. Assuming 5 nm lateral
resolution and 50 nm sections (typical values for neuropil tracing studies) the 10 mm³ block
consists of $8 \times 10^{15}$ voxels and would require on the order of 2,000 years to image! Such long
imaging time makes this tracing study infeasible on these machines.
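
The voxel count and imaging time quoted above follow from simple arithmetic, which the sketch below reproduces. It assumes a single beam imaging one pixel at a time at the stated 100 kHz rate, with no overhead for sectioning or stage movement; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def voxel_count(volume_mm3, lateral_nm, section_nm):
    """Number of voxels in a volume sampled at the given voxel size."""
    nm3_per_mm3 = 1e18  # (1e6 nm per mm) cubed
    return volume_mm3 * nm3_per_mm3 / (lateral_nm * lateral_nm * section_nm)


def imaging_years(voxels, pixel_rate_hz):
    """Years needed to image `voxels` one pixel at a time."""
    return voxels / pixel_rate_hz / SECONDS_PER_YEAR


voxels = voxel_count(volume_mm3=10.0, lateral_nm=5.0, section_nm=50.0)
years = imaging_years(voxels, pixel_rate_hz=100e3)
print(f"{voxels:.1e} voxels, about {years:,.0f} years at 100 kHz")
```

The result is on the order of a few thousand years, consistent with the estimate in the text.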


ATLUM: The largest volume sectioned on the ATLUM so far is 0.1 mm³ with blockface
dimensions of 1.4 mm × 4.5 mm and section thickness of 45 nm (total of 400 sections). ATLUM
cutting speed is around 0.03 mm/sec and each of these ultrathin sections (6.3 mm² in area) was
produced at a rate of approximately 4 minutes per section. If such a rate could be reliably
sustained it would take about 3 months for an ATLUM-like device to reduce a 10 mm³ tissue
block to a series of 50 nm sections each securely mounted on a long Mylar tape ready for
imaging.
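
The three-month sectioning estimate can be reproduced as follows. This sketch assumes that the 10 mm³ block keeps the same 1.4 mm × 4.5 mm block face as the demonstrated run and that the rate of 4 minutes per section is sustained without interruption; both are simplifying assumptions, and the function name is hypothetical.

```python
def sectioning_estimate(volume_mm3, block_face_mm2, section_nm,
                        minutes_per_section):
    """Return (number of sections, total sectioning time in days)."""
    depth_mm = volume_mm3 / block_face_mm2          # block depth to cut through
    n_sections = depth_mm * 1e6 / section_nm        # 1 mm = 1e6 nm
    days = n_sections * minutes_per_section / (60 * 24)
    return n_sections, days


n_sections, days = sectioning_estimate(
    volume_mm3=10.0,
    block_face_mm2=1.4 * 4.5,   # 6.3 mm^2, as in the demonstrated run
    section_nm=50.0,
    minutes_per_section=4.0,
)
print(f"about {n_sections:,.0f} sections in {days:.0f} days")
```

Roughly thirty thousand sections at four minutes each come to just under ninety days.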


Because the ATLUM collects sections for later imaging, its sections can be post-stained with
heavy metals. This allows adequate signal to noise ratios with shorter image acquisition
times. Imaging rate is around 1 MHz (1.0 microsecond per pixel) so bulk imaging the entire
10 mm³ would require on the order of 200 years. This still would render the project infeasible;
however, unlike the block face techniques, the ATLUM technique can take advantage of
directed high-resolution imaging to dramatically shorten this imaging time.


As stated above, at current sectioning speeds, the ATLUM could reduce a 10mm»? block of 
brain tissue to a tape of ultrathin sections in just a few months. Such a tape would constitute a 
permanent “ultrathin section library" of the entire volume, and any part of it could be 
randomly imaged (and re-imaged) at whatever resolution desired. A researcher could 
coarsely image the entire volume in a few days, find a target neuron within the volume, and 
then subsequently direct high-resolution imaging exactly where it is needed to trace that 
neuron's processes and its connections with additional neurons. In this way, a multi-neuron 
circuit, stretching throughout the entire 10 mm³ volume, could be mapped with synaptic
precision while only imaging perhaps one one-thousandth of the whole volume. In this 
fashion, the ATLUM can potentially reduce the time needed to trace a complete neural circuit 
from hundreds of years to just a few months, making the V1 trial emulation experiment 
potentially feasible in the near-term. 
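
The reduction from hundreds of years to a few months follows directly from the numbers given. The sketch below reproduces it, taking the 1 MHz pixel rate and the one-thousandth imaged fraction from the text; the function and constant names are illustrative only, and the time for the coarse survey pass is ignored.

```python
# Bulk versus directed imaging time for the 10 mm^3 block at 5 x 5 x 50 nm.
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY
PIXEL_RATE_HZ = 1e6        # 1.0 microsecond per pixel
DIRECTED_FRACTION = 1e-3   # "perhaps one one-thousandth of the whole volume"


def voxel_count(volume_mm3, lateral_nm=5.0, section_nm=50.0):
    """Number of voxels in a volume at the given voxel dimensions."""
    nm3_per_mm3 = 1e6 ** 3
    return volume_mm3 * nm3_per_mm3 / (lateral_nm ** 2 * section_nm)


all_voxels = voxel_count(10.0)
bulk_years = all_voxels / PIXEL_RATE_HZ / SECONDS_PER_YEAR
directed_days = all_voxels * DIRECTED_FRACTION / PIXEL_RATE_HZ / SECONDS_PER_DAY

print(f"voxels: {all_voxels:.1e}")
print(f"bulk imaging: about {bulk_years:.0f} years")
print(f"directed imaging: about {directed_days:.0f} days")
```

This gives 8·10¹⁵ voxels, roughly 250 years for bulk imaging and roughly three months for directed imaging.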


Parenthetically, we can also note that the near-term goal of tracing and emulating circuits of a 
few hundred neurons drastically reduces the image processing and computer simulation 
requirements. Instead of requiring millions of neurons with billions of synapses to be traced 
and identified, only the limited subset under study need be traced in the image data. The 
successful completion of "partial brain" emulation studies (such as the hypothetical V1 study 
above) is probably a necessary precursor to spur the type of research and investment in 
sectioning, imaging, tracing, and emulation technologies needed for eventually scaling-up to
whole brain emulation levels. 


Possibilities for increasing SEM imaging speed 

From the above discussion it is clear that long imaging times constitute a major barrier to 
whole brain emulation using SEM techniques. However, there is currently a major research 
push toward massively parallel multi-beam SEMs, which has the potential to speed up SEM
imaging by many orders of magnitude. This research push is being driven by the
semiconductor industry as part of its effort to reduce feature sizes on computer chips below 
the level that traditional photolithography can produce. 


The circuitry patterns within computer chips are produced through a series of etching and 
doping steps. Each of these steps must affect only selected parts of the chip, so areas to be left 
unaffected are temporarily covered by a thin layer of polymer which is patterned in exquisite
detail to match the sub-micron features of the desired circuitry. For current mass production 
of chips this polymer layer is patterned by shining ultraviolet light through a mask onto the 
surface of the silicon wafer which has been covered with the photopolymer in liquid form. 
This selectively cures only the desired parts of the photopolymer. To obtain smaller features 
than UV light can allow, electron beams (just as in a SEM) must instead be used to selectively 
cure the photopolymer. This process is called e-beam lithography. Because the electron beam 
must be rastered across the wafer surface (instead of flood illuminating it as in light 
lithography) the process is currently much too slow for production level runs. 


Several research groups and companies are currently addressing this speed problem by 
developing multi-beam e-beam lithography systems (Kruit, 1998; van Bruggen, van Someren 
et al., 2005; van Someren, van Bruggen et al., 2006; Arradiance Inc). In these systems, 
hundreds to thousands of electron beams raster across a wafer's surface simultaneously 
writing the circuitry patterns. These multi-beam systems are essentially SEMs, and it should 
be a straightforward task to modify them to allow massively parallel scanning as well
(Pickard, Groves et al., 2003). For backscatter imaging (as in the SBFSEM, FIBSEM, and
ATLUM technologies) this might involve mounting a scintillator with a grid of holes (one for 
each e-beam) very close to the surface of the tissue being imaged. In this way the interactions 
of each e-beam with the tissue can be read off independently and simultaneously. 


It is difficult to predict how fast these SEMs may eventually get. A 1,000 beam SEM where
each individual beam maintains the current 1 MHz acquisition rate for stained sections 
appears reachable within the next ten years. We can very tentatively apply this projected SEM 
speedup to ask how long imaging a human brain would take. First, assume a brain were 
sliced into 50nm sections on ATLUM-like devices (an enormous feat which would itself take 
approximately 1,000 machines — each operating at 10x the current sectioning rate — a total of 
3.5 years to accomplish). This massive ultrathin section library would contain the equivalent 
of 1.1·10²¹ voxels (at 5x5x50 nm per voxel). Assuming judicious use of directed imaging
within this ultrathin section library only 1/10 may have to be imaged at this extremely high
resolution (using much lower-resolution, and thus faster, imaging on white matter tracts, cell body
interiors etc.). This leaves roughly 1.1·10²⁰ voxels to be imaged at high resolution. If 1,000
SEMs each utilizing 1,000 beamlets were to tackle this imaging job in parallel their combined
data acquisition rate would be 1·10¹² voxels per second. At this rate the entire imaging task
could be completed in less than 4 years. 
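
Both whole-brain estimates can be reproduced from the quoted figures. The sketch below makes two assumptions not stated in the text: the brain volume is inferred from the 1.1·10²¹ voxel count, and each ATLUM-like device is assumed to cut sections with the 6.3 mm² area reported earlier.

```python
# Rough check of the whole-brain sectioning and imaging estimates.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
MINUTES_PER_YEAR = 365.25 * 24 * 60

TOTAL_VOXELS = 1.1e21
VOXEL_NM3 = 5 * 5 * 50
HIGH_RES_FRACTION = 0.1

# Imaging: 1,000 SEMs x 1,000 beamlets x 1 MHz per beamlet.
combined_rate_hz = 1000 * 1000 * 1e6
imaging_years = TOTAL_VOXELS * HIGH_RES_FRACTION / combined_rate_hz / SECONDS_PER_YEAR

# Sectioning: 1,000 machines at 10x the current 4 minutes per section.
brain_volume_mm3 = TOTAL_VOXELS * VOXEL_NM3 / 1e18   # inferred, about 1.4e6 mm^3
section_volume_mm3 = 6.3 * 50e-6                     # assumed 6.3 mm^2 x 50 nm
n_sections = brain_volume_mm3 / section_volume_mm3
sectioning_years = n_sections * (4.0 / 10) / 1000 / MINUTES_PER_YEAR

print(f"imaging: about {imaging_years:.1f} years")
print(f"sectioning: about {sectioning_years:.1f} years")
```

The imaging time comes out at about 3.5 years and the sectioning time at a little over 3 years, in line with the figures above.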


Nanodisassembly 


The most complete approach would be to pick the brain apart atom by atom or molecule by 
molecule, recording their position and type for further analysis. The scenario in (Moravec,
1988) can also be described as nanodisassembly (in an unfixated brain, with on-the-fly
emulation) working on a slightly larger size scale. (Merkle, 1994) describes a relatively
detailed proposal where the brain is divided into 3.2·10¹⁵ 0.4 µm cubes where each cube
would be disassembled atomically (and atom/molecule positions recorded) by a disassembler 
nanodevice (Drexler, 1986) over a three year period. 


It has been pointed out that medical nanorobotics is the form of nanotechnology best used for 
non-destructive scanning. Given that no detailed proposal for a nanodisassembler has been 
made it is hard to evaluate the chances of nanodisassembly. It would have to act at a low 
temperature to prevent molecules in the sample from moving around, removing surface 
molecules one by one, identifying them and transmitting the position, orientation and type to 
second-line data aggregators. Clear challenges are the construction of tool tips that can extract 
arbitrary molecules or detect molecular type for further handling with specialized tool tips, as 
well as handling macromolecules and fragile molecular structures. Macromolecules can likely 
not be pulled out in one piece. Atomic disassembly would avoid the complications of 
molecules for the greater simplicity of a handful of atom types, at the price of needing to 
break molecular bonds, the risk of ensuing rearrangements and the possibility of creating 
reactive free radicals. Mature productive nanosystems would appear to be a necessary 
precursor technology for this approach (probably for the production of the massively parallel 
disassembly system required for controlled atomic disassembly of a macroscopic object) 
(Foresight Nanotech Institute and Battelle Memorial Institute, 2007). It appears likely that the
kind of technology needed for disassembly would be significantly more complex than the one 
needed for atomically precise assembly. 


6 Robert Freitas Jr., personal communication.


Chemical analysis


A key challenge is to detect the chemical state and type of cellular components. Normally this 
is done by staining with dyes or quantum dots that bind to the right target, followed by 
readout using optical methods. Beside the need for diffusing dyes through the samples, each 
dye is only selective for a certain target or group of targets, necessitating multiple dyes for 
identifying all relevant components. If the number of chemicals that have to be identified is 
large, this would make dyeing ineffective. 


One possible approach is Raman microspectroscopy (Krafft, 2004; Krafft, Knetschke et al., 
2003), where near-infrared scattering is used to image the vibration spectrum of the chemical 
components (mainly macromolecules) of tissue. The resolution for near infrared spectroscopy
is about 1 µm (limited by diffraction) and confocal methods can be used for 3D imaging.
Recording times are very long, on the order of minutes for individual pixels; in order to be
useful for WBE this has to be sped up significantly or parallelized. Using shorter
wavelengths appears to induce tissue damage (Puppels, Olminkhof et al., 1991), which may 
be of little concern for destructive scanning. Ultraviolet resonance microspectroscopy has also 
been used, enabling selective probing of certain macromolecules (Pajcini, Munro et al., 1997; 
Hanlon, Manoharan et al., 2000). In some cases native fluorescence can enable imaging by 
triggering it with UV light, laser-induced native fluorescence, LINF, such as in the case of 
serotonin (Tan, Parpura et al., 1995; Parpura, Tong et al., 1998) and possibly dopamine 
(Mabuchi, Shimada et al., 2001). 


A new method with great promise is array tomography, a hybrid between optical 
fluorescence imaging and electron microscopy (Micheva and Smith, 2007). Samples are cut 
into ultrathin (50-200 nm thick) sections forming an ordered array on a glass slide. The array 
is then labelled with fluorescent antibodies or other stains and imaged, generating a 3D 
reconstruction with a depth resolution set by the sectioning thickness. After this imaging the 
array can be eluted to remove the stain, restained with a new stain, imaged and so on. Finally 
it can be stained with metal and imaged in a scanning electron microscope. The different 
image stacks can then be combined, giving a high resolution 3D reconstruction with chemical 
information. 


At present it looks uncertain how much functionally relevant information can be determined 
from spectra. If the number of neuron types is relatively low and chemically distinct, it might 
be enough to recognize their individual profiles. Adding dyes tailored to disambiguate 
otherwise indistinguishable cases may also help. 


Embedding, fixation and staining techniques 


Most forms of destructive scanning require fixation and embedding the brain to be scanned 
so that it can be handled, sectioned, and imaged well. This process must preserve 
ultrastructure and necessary chemical information. 


Current electron microscopy often relies on osmium tetroxide and other heavy metal stains to 
stain cell membranes for improved contrast (Palay, McGee-Russell et al., 1962). It appears 
likely that unless pure morphology can be used to deduce functional properties, improved 
staining methods are needed to create a link between what can be imaged and what is 
functionally relevant. This might be an area where nanoparticles (and later, more advanced 
nanodevices) will be important.


Conclusion


It is likely that the scanning methods used to provide raw data for WBE will be, at least in the
early days, destructive methods making use of sectioning and scanning.


Assuming that imaging the finest axonal processes and synaptic spines (≈ 50 nm) is required,
this sets a resolution requirement on the order of 5 nm in at least two directions. Comparing
the methods mentioned and their resolution we get the following categorization:


Table 5: Scanning methods

Method                    Resolution                                        Notes
MRI                       > 5.5 µm (non-frozen)                             Does not require sectioning, may achieve better resolution on vitrified brains.
MRI microscopy            3 µm
NIR microspectroscopy     1 µm
All-optical histology     0.7 µm
KESM                      0.3 µm x 0.5 µm
X-ray microtomography     0.47 µm
MRFM                      80 nm
SI                        50 nm
X-ray microscopy          30 nm                                             Spectromicroscopy possible?
------------------------- Required resolution? -------------------------
SBFSEM                    50-70 nm x 1-20 nm
FIBSEM                    30-50 nm x 1-20 nm
ATLUM                     40 nm x 5 nm
SSET                      50 nm x 1 nm
Atomic beam microscopy    10 nm                                             Not implemented yet.
NSOM                      5 nm?                                             Requires fluorescent markers, spectroscopy possible.
SEM                       1-20 nm
Array tomography          1-20 nm SEM, 50x200x200 nm fluorescence stains    Enables multiple staining.
TEM                       <1 nm                                             Basic 2D method, must be combined with sectioning or tomography for 3D imaging. Damage from high energy electrons at high resolutions.





A 5x5x50 nm resolution brain scan requires 1.4·10²¹ voxels, a large amount of raw data (see
next section). Destructive scanning working on fixated brains may avoid having to store the
entire dataset by only storing the most recent slices of brain and performing image processing on
these. Instead of first scanning the entire brain, and then processing the image data to extract
the relevant features and neuronal network, the extraction can proceed piecemeal and
concurrently as new imaging data is collected. Once relevant information has been extracted
from a data batch, the raw visual data for that batch can be discarded.


However, ideas such as the single brain physical library (Hayworth, 2002) demonstrate that
the scanning does not have to be single-shot: a brain is sectioned and fixed in a suitable
manner for interactive scanning (possibly using several modalities) and retrieval. This also
gets around the data storage problem by only needing to scan regions of interest and store
their data (possibly temporarily, since they can be re-scanned if needed later). In early
neuroinformatics applications, this might include mapping out the long-range connectivity of
a subset of representative neurons in order to find their morphology and connectivity, only
requiring imaging a very small subset of the brain's volume (0.01% for mapping out 100,000
neurons in a human brain). It may also be possible to combine this approach with methods
like array tomography (Micheva and Smith, 2007) for combined chemical and structural
maps.


Low resolution methods such as optical microscopy are an important test of many aspects of 
the WBE research endeavour, such as large-scale data management, inferring neural function 
from morphology (possibly using spectroscopy or other methods of estimating chemical 
states), and providing early test data for tracing and connectivity reconstruction algorithms. 


If the required level of simulation is at level 5 or above (detailed cellular electrophysiology 
and neuron connectivity: see Table 2) we appear to have already achieved the resolution 
requirements and the remaining problem is in data/tissue management and scaling up 
methods to handle large brains. Using KESM or ATLUM it should be possible in the very 
near future to construct a detailed connectome of the brain, in particular the exact 
interconnections of cortical minicolumns. 


If the required level for WBE is below level 5, increases of imaging resolution are of relatively 
limited use. Rather, we need modalities that enable mapping the proteins, possibly their 
states and the presence of metabolites or RNA transcripts. This would pose a major research 
challenge, although it is in line with much research interest today examining the biophysics 
of cells. 



Image processing and scan interpretation 


The data from the scanning must be postprocessed and interpreted in order to become useful 
for brain emulation (or other research). Cell membranes must be traced, synapses identified, 
neuron volumes segmented, distribution of synapses, organelles, cell types and other 
anatomical details (blood vessels, glia) identified. Currently this is largely done manually: 
cellular membranes can be identified and hand-traced at a rate of 1-2 hours/µm³ (Fiala and
Harris, 2001), far too slow for even small cortical volumes. 


Software needed includes (after (Fiala, 2002)): 
• Geometric adjustment (aligning sections, handling shrinkage, distortions)
• Noise removal
• Data interpolation (replacing lost or corrupted scan data)
• Cell membrane tracing (segmentation, tracing in 2D and 3D)
• Synapse identification
• Identification of cell types
• Estimation of parameters for emulation
• Connectivity identification
• Databasing


Data handling is at present a bottleneck. 0.1 mm³ at 400 pixels/µm resolution and 50 nm
section thickness would (compressed) contain 73 terabytes of raw data. A full brain at this
resolution would require 10⁹ terabytes (Fiala, 2002). While extremely large, even this might
one day be regarded as feasible (see Appendix B). However, as mentioned in the previous 
chapter, for emulation purposes it may be far more practical to perform a sequence of 
scanning and interpretation steps so that only a smaller buffer of high resolution data is 
necessary. 
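
The scaling from the 0.1 mm³ sample to a full brain can be sketched as follows. The 73 terabyte figure is taken from the text; the brain volume of 1.4·10⁶ mm³ is an assumed round value for a human brain, not a number given by (Fiala, 2002).

```python
# Scaling the quoted storage figure from a 0.1 mm^3 sample to a whole brain.
SAMPLE_VOLUME_MM3 = 0.1
SAMPLE_STORAGE_TB = 73.0        # compressed, as quoted in the text
PIXELS_PER_UM = 400
SECTION_THICKNESS_NM = 50
BRAIN_VOLUME_MM3 = 1.4e6        # assumption: roughly 1,400 cm^3

# Raw voxel count of the sample, before compression.
um3_per_mm3 = 1e9
voxels_per_um3 = PIXELS_PER_UM ** 2 * (1000 / SECTION_THICKNESS_NM)
sample_voxels = SAMPLE_VOLUME_MM3 * um3_per_mm3 * voxels_per_um3

# Linear extrapolation of storage to the whole brain.
brain_storage_tb = SAMPLE_STORAGE_TB * BRAIN_VOLUME_MM3 / SAMPLE_VOLUME_MM3

print(f"sample voxels: {sample_voxels:.1e}")
print(f"whole brain: about {brain_storage_tb:.1e} terabytes")
```

The extrapolation gives about 10⁹ terabytes, matching the order of magnitude quoted above.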


If the brain is divided into small blocks, each block could be analyzed independently, at least 
in terms of low-level image processing and preliminary tracing. Data from neighbouring 
blocks would need to be compared but there is no need for the analysis of more remote 
blocks (with the possible exception of interpolating lost data). This suggests that it can be 
parallelized to a great degree. As the scan progresses, low-level data can be discarded to 
make room for the compressed high-level representation needed to build the emulation⁷.
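
A minimal sketch of this block-wise scheme is given below. The scanner and the feature extractor are hypothetical stand-ins (random data and two summary statistics) meant only to show the control flow: each raw block is reduced to a compact record and then dropped.

```python
import numpy as np


def scan_blocks(n_blocks, shape=(8, 8, 8), seed=0):
    """Hypothetical scanner: yields (index, raw voxel block) one at a time."""
    rng = np.random.default_rng(seed)
    for index in range(n_blocks):
        yield index, rng.random(shape)


def extract_features(block):
    """Stand-in for tracing: reduce a raw block to a small summary record."""
    return {"mean": float(block.mean()), "n_bright": int((block > 0.5).sum())}


def process_stream(blocks):
    """Keep only the compact summaries; raw blocks are not retained."""
    summaries = {}
    for index, raw in blocks:
        summaries[index] = extract_features(raw)
        # 'raw' is rebound on the next iteration, so the voxel data can be freed.
    return summaries


summaries = process_stream(scan_blocks(4))
print(summaries[0])
```

Because each block is handled independently, the loop body could be distributed over many workers, with only neighbouring blocks needing to exchange boundary information.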


This section deals with the assumption that WBE is achieved using image-based methods 
rather than correlation analysis methods where e.g. nanomachines record local neural activity 
and estimate connectivity from the correlations in the activity pattern. Correlation methods 
would have corresponding demands for signal processing and inference, but in the time 
domain rather than the space domain. 


7 This loss of data may be of concern, since initial scans are likely to be expensive (and hence hard to repeat),
particularly for human brains (which are individually valuable). Some forms of scanning such as ATLUM may
produce storable 'libraries' of slices that can be retained for re-scanning or further analysis (Hayworth, 2002).


Geometric adjustment


Various methods for achieving automatic registration (correcting differences in alignment) of
image stacks are being developed. At its simplest, registration involves finding a combination
of translation, scaling, and rotation that makes subsequent images match best. However,
skewing and non-linear distortions can occur, requiring more complex methods. Combining
this with optimization methods and an elastic model to correct for shape distortion produced
good results with macroscopic stacks of rat and human brains (Schmitt, Modersitzki et al.,
2007).
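
As an illustration of the simplest case only (pure translation between two sections), the sketch below uses phase correlation. This is a standard technique chosen for the example, not the method of the cited study, and it handles neither scaling, rotation nor elastic distortion. The test images are synthetic.

```python
import numpy as np


def estimate_translation(reference, moved):
    """Estimate the integer (row, col) shift that maps reference onto moved.

    Uses phase correlation: the normalised cross-power spectrum of two
    cyclically shifted images transforms back to a peak at the shift.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12   # keep phase only
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts.
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, correlation.shape)
    )


rng = np.random.default_rng(0)
section_a = rng.random((64, 64))
section_b = np.roll(section_a, shift=(3, -5), axis=(0, 1))  # known offset
print(estimate_translation(section_a, section_b))  # (3, -5)
```

Real section pairs differ in content as well as position, so the correlation peak is broader and a practical pipeline would follow this step with the non-linear corrections described above.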


Noise removal 


Removing noise from images is part of standard image processing, with an extensive 
literature and strong research interest. In general, noise removal is simplified when the kinds 
of noise introduced by scanning artefacts and the nature of the system are known. As an 
example, see (Mayerich, McCormick et al., 2007) where light variations and knife chatter 
noise were removed from KESM data. 


Data interpolation 


Lost/corrupted data must be replaced with probabilistic interpolations. This might require 
feedback from later stages to find the most likely interpretation or guess, constrained by what 
makes sense given known data. 


For large lost volumes, generic neurons and connectivity might have to be generated based 
on models of morphology and connectivity. The goal would be to avoid changing the 
functionality of the total network. Early results in WBE and statistics from successful imaging 
volumes should provide much useful input in developing this. 


Sectioning the brain into individually scannable chunks will introduce data loss and possible 
misalignment along the edges (for example, the KESM suffers damage up to 5 µm in width
between different columns (Kwon, Mayerich et al., 2008)). This may prove to be the major 
source of lost data in WBE. Since this could cause mis-tracing of long axons crossing 
numerous chunk boundaries, solving this issue is a high priority. Alignment can probably be 
achieved reliably if the lost zone is sufficiently smaller than the correlation length in the 
surrounding images.


Cell tracing





Figure 17: Blood vessel reconstruction from KESM Nissl data. (Copyright Brain Networks
Laboratory, Texas A&M University)

Automated tracing of neurons imaged using confocal microscopy has been attempted using a
variety of methods. Even if the scanning method eventually used for WBE is different, it seems
likely that knowledge gained from these reconstruction methods will be useful.


One approach is to enhance edges and find the optimal joining of edge pixels/voxels to detect 
contours of objects. Another is skeletonization. For example, (Urban, O'Malley et al., 2006) 
thresholded neuron images (after image processing to remove noise and artefacts), extracting 
the medial axis tree. (Dima, Scholz et al., 2002) employed a 3D wavelet transform to perform 
a multiscale validation of dendrite boundaries, in turn producing an estimate of a skeleton. 


A third approach is exploratory algorithms, where the algorithm starts at a point and uses 
image coherency to trace the cell from there. This avoids having to process all voxels, but 
risks losing parts of the neuron if the images are degraded or unclear. (Al-Kofahi, Lasek et al., 
2002) use directional kernels acting on the intensity data to follow cylindrical objects. 
(Mayerich and Keyser, 2008) use a similar method for KESM data, accelerating the kernel 
calculation by using graphics hardware. (Uehara, Colbert et al., 2004) calculate the
probability of each voxel belonging to a cylindrical structure, and then propagate dendrite
paths through it.
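
The exploratory idea can be sketched with a seeded region-growing routine. This is a deliberately simplified stand-in, not any of the cited algorithms: it follows 6-connected voxels above an intensity threshold, and both the threshold and the synthetic test volume are assumptions made for the example.

```python
from collections import deque

import numpy as np


def trace_from_seed(volume, seed, threshold):
    """Return a boolean mask of voxels reachable from seed above threshold."""
    visited = np.zeros(volume.shape, dtype=bool)
    if volume[seed] < threshold:
        return visited
    visited[seed] = True
    queue = deque([seed])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            neighbour = (z + dz, y + dy, x + dx)
            inside = all(0 <= c < s for c, s in zip(neighbour, volume.shape))
            if inside and not visited[neighbour] and volume[neighbour] >= threshold:
                visited[neighbour] = True
                queue.append(neighbour)
    return visited


# Synthetic "process": a bright line with a two-voxel gap of degraded data.
volume = np.zeros((5, 5, 20))
volume[2, 2, :12] = 1.0
volume[2, 2, 14:] = 1.0
traced = trace_from_seed(volume, seed=(2, 2, 0), threshold=0.5)
print(int(traced.sum()))  # 12
```

Only the voxels visited are examined, but the trace stops at the gap and misses the far segment, which is the failure mode noted above for degraded images.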


One weakness of these methods is that they assume cylindrical shapes of dendrites and the 
lack of adjoining structures (such as dendritic spines). By using support-vector machines that 
are trained on real data a more robust reconstruction can be achieved (Santamaría-Pang, 
Bildea et al., 2006). 


Overall, tracing of branching tubular structures is a major interest in medical computing. A
survey of vessel extraction techniques listed 14 major approaches, with several examples of
each (Kirbas and Quek, 2004). The success of different methods is modality-dependent. 





Figure 18: 3D visualization of Golgi-stained cell reconstructed from KESM data.
(Copyright Brain Networks Laboratory, Texas A&M University)





Figure 19: 3D reconstruction of a cube (2 µm side) of neuropil from rat hippocampus.
Axons are green, dendrites ochre, astrocytes pale blue, myelin dark blue. (Copyright J
Spacek, Synapse Web)


Synapse identification





Figure 20: Reconstructed spiny dendrite of CA1 pyramidal cell, rendered from data in 
(Harris and Stevens, 1989). (Copyright Synapse Web) 


In electron micrographs, synapses are currently recognized using the criteria that within a 
structure there are synaptic vesicles adjacent to a presynaptic density, a synaptic density with 
electron-dense material in the cleft and densities on the cytoplasmic faces in the pre- and 
postsynaptic membranes (Colonnier, 1981; Peters and Palay, 1996). 


One of the major unresolved issues for WBE is whether it is possible to identify the functional 
characteristics of synapses, in particular synaptic strength and neurotransmitter content, from 
their morphology. 


In general, cortical synapses tend to be either asymmetrical "type I" synapses (75-95%) or
symmetrical "type II" synapses (5-25%), based on having a prominent or thin postsynaptic
density. Type II synapses appear to be inhibitory, while type I synapses are mainly excitatory 
(but there are exceptions) (Peters and Palay, 1996). This allows at least some inference of 
function from morphology. 


The shape and type of vesicles may also provide clues about function. Small, clear vesicles 
appear to mainly contain small-molecule neurotransmitters; large vesicles (60 nm diameter) 
with dense cores appear to contain noradrenaline, dopamine or 5-HT; and large vesicles (up 
to 100 nm) with 50-70 nm dense cores contain neuropeptides (Hökfelt, Broberger et al., 2000;
Salio, Lossi et al., 2006). Unfortunately there does not appear to be any further distinctiveness 
of vesicle morphology to signal neurotransmitter type. 
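
The vesicle criteria above amount to a small decision rule, sketched below. The 50 nm core cutoff is an assumption chosen to separate the two dense-core cases described in the text, and the returned labels are hints rather than identifications.

```python
def vesicle_content_hint(dense_core_nm=None):
    """Suggest likely vesicle content from morphology alone.

    dense_core_nm is None for clear vesicles; otherwise it is the measured
    dense-core diameter in nanometres. The 50 nm cutoff is an assumption.
    """
    if dense_core_nm is None:
        return "small-molecule neurotransmitter"
    if dense_core_nm >= 50:
        return "neuropeptide"
    return "noradrenaline, dopamine or 5-HT"


print(vesicle_content_hint())                  # clear vesicle
print(vesicle_content_hint(dense_core_nm=60))  # large dense core
print(vesicle_content_hint(dense_core_nm=30))  # smaller dense core
```

Such a rule narrows the candidates but, as noted, cannot distinguish individual transmitters within each class.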


Identification of cell types


Distinguishing neurons from glia and identifying their functional type requires other 
advances in image recognition. 


The definition of neuron types is debated, as well as the number of types. There might be as 
many as 10,000 types, generated through an interplay of genetic, posttranscriptional, 
epigenetic, and environmental interactions (Muotri and Gage, 2006). There are some 30+ 
named neuron types, mostly categorized based on chemistry and morphology (e.g. shape, the 
presence of synaptic spines, whether they target somata or distal dendrites). Distinguishing 
morphologically different groups appears feasible using geometrical analysis (Jelinek and
Fernandez, 1998). 


In terms of electrophysiology, excitatory neurons are typically classified into regular-spiking, 
intrinsic bursting, and chattering, while interneurons are classified into fast-spiking, burst 
spiking, late-spiking and regular spiking. However, alternate classifications exist. (Gupta, 
Wang et al., 2000) examined neocortical inhibitory neurons and found three different kinds of 
GABAergic synapses, three main electrophysiological classes divided into eight subclasses, 
and five anatomical classes, producing 15* observed combinations. Examining the subgroup 
of somatostatin-expressing inhibitory neurons produced three distinct groups in terms of 
layer location and electrophysiology (Ma, Hu et al., 2006) with apparently different functions. 
In prefrontal cortex layer 2/3 inhibitory neurons' morphology and electrophysiology also 
produced clusters of distinct types (Krimer, Zaitsev et al., 2005). 


Overall, it appears that there exist distinct classes of neurons in terms of neurotransmitter, 
neuropeptide expression, protein expression (e.g. calcium binding proteins), and overall 
electrophysiological behaviour. Morphology often shows clustering, but there may exist 
intermediate forms. Similarly, details of electrophysiology may show overlap between 
classes, but have different population means. 


Some functional neuron types are readily distinguished from morphology (such as the five 
types of the cerebellar cortex). A key problem is that while differing morphologies likely
imply differing functional properties, the reverse may not be true. Some classes of neurons
appear to show a strong link between electrophysiology and morphology (Krimer, Zaitsev et 
al., 2005) that would enable inference of at least functional type just from geometry. In the 
case of layer 5 pyramidal cells, some studies have found a link between morphology and 
firing pattern (Kasper, Larkman et al., 1994; Mason and Larkman, 1990), while others have 
not (Chang and Luebke, 2007). It is quite possible that different classes are differently 
identifiable, and that the morphology-function link could vary between species. 


In many species there exist identifiable neurons, neurons that can be distinguished from other 
neurons in the same animal and identified across individuals, and sets of equivalent cells that 
are mutually indistinguishable (but may have different receptive fields) (Bullock, 2000). 
While relatively common in small and simple animals, identifiable neurons appear to be a 
minority in larger brains. Early animal brain emulations may make use of the equivalence by 
using data from several individuals, but as the brains become larger it is likely that all 
neurons have to be treated as individual and unique.


Estimation of parameters for emulation


(Markram, 2006) lists the following necessary building blocks for reconstructing neural
microcircuits:


Table 6: Necessary data for reconstructing neural microcircuits


Many of these are in turn multivariate, such as ionic permeabilities or the list of
conductances.


Neuron data in the Neocortical Microcircuit Database (Markram, 2005) include 135
electrophysiological parameters derived from responses to stimuli, 234 geometric parameters
based on neural reconstruction and 51 genetic properties from single cell RT-PCR data. For
synapses, the data cover the identities of the cells involved, anatomical properties (15 parameters),
physiological properties (reversal potential and pharmacological block), synaptic dynamics (5
parameters) and kinetics (14 parameters).


Electrophysiology 


Given electrophysiological data of a cell's responses to simple current stimuli it is possible to
replicate the response in a compartment model (in particular the density of ionic channels)
through automated methods such as genetic algorithms and simulated annealing
(Druckmann, Banitt et al., 2007; Keren, Peled et al., 2005; Vanier and Bower, 1999; Van Geit,
Achard et al., 2007). However, fitting experiments have shown that the same neural activity
can be produced by different sets of conductances (Achard and De Schutter, 2006). This may
suggest less need to detect all ionic conductances if done by other means, but the results also
suggest that the set of good models forms relatively isolated hyperplanes: relatively precise
fitting of all parameters is needed to get good performance.
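
The flavour of such automated fitting can be shown on a toy problem. The sketch below fits the leak conductance and capacitance of a passive single compartment to a synthetic step response using simulated annealing. The model, cooling schedule, step size and parameter values are all assumptions made for illustration; real compartment models have many more parameters and a far rougher error surface.

```python
import math
import random


def membrane_response(g_leak, capacitance, times, current=1.0):
    """Voltage of a passive compartment in response to a current step."""
    return [current / g_leak * (1.0 - math.exp(-t * g_leak / capacitance)) for t in times]


def squared_error(params, times, target):
    model = membrane_response(params[0], params[1], times)
    return sum((m - v) ** 2 for m, v in zip(model, target))


def anneal(times, target, start, steps=5000, seed=1):
    """Simulated annealing over (g_leak, capacitance); returns best fit found."""
    rng = random.Random(seed)
    current, current_err = start, squared_error(start, times, target)
    best, best_err = current, current_err
    for step in range(steps):
        temperature = max(1.0 - step / steps, 1e-9)   # linear cooling
        # Multiplicative perturbation keeps both parameters positive.
        candidate = tuple(max(1e-3, p * math.exp(rng.gauss(0.0, 0.1))) for p in current)
        err = squared_error(candidate, times, target)
        accept = err < current_err or rng.random() < math.exp((current_err - err) / temperature)
        if accept:
            current, current_err = candidate, err
            if err < best_err:
                best, best_err = candidate, err
    return best, best_err


times = [0.5 * i for i in range(41)]
target = membrane_response(0.5, 2.0, times)   # synthetic "recording"
start = (1.0, 1.0)
start_err = squared_error(start, times, target)
best, best_err = anneal(times, target, start)
print(f"start error {start_err:.3f}, best error {best_err:.6f}, best params {best}")
```

With only two parameters the fit is well constrained; the degeneracy discussed above arises when many conductances can trade off against one another.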


Gene expression


Different inhibitory interneurons show different gene expression for different receptors, ion 
channels and gap junction-forming proteins (Blatow, Caputi et al., 2005). Strong correlations 
between gene expression of certain ion channels and conductances have been observed 
suggesting that function can be inferred from genetic data (Toledo-Rodriguez, Blumenfeld et 
al., 2004). Different neurons show variable expression levels, yet different neuron types have 
unique patterns (Schulz, Goaillard et al., 2006; Schulz, Goaillard et al., 2007). Differences in 
alternate splicing have been found in individual cells within a neuron population, but there 
are also differences in splicing in different brain regions suggesting there are overall 
regulatory mechanisms (Wang and Grabowski, 1996). 


If scanning can detect relevant mRNA levels (and the link between these and functional
properties has been found using high-throughput analysis of all kinds of neurons) it would
likely be possible to deduce functional type.


Detecting synaptic efficacies 


Normally synaptic properties are determined electrophysiologically by triggering action 
potentials in the presynaptic neuron and measuring the response in the postsynaptic neuron. 
This can detect not just the general strength of the synapse but also adaptation properties. 


There are cases where electrophysiological properties appear to be linked to detectable 
differences in synaptic morphology, in particular the appearance of vesicle ribbons (Fields 
and Ellisman, 1985; Hull, Studholme et al., 2006). LTP likely affects synaptic curvature, size 
and perforations of the postsynaptic density in a time dependent manner (Marrone and Petit, 
2002; Marrone, 2007). It is commonly assumed that this, as well as more short-term 
electrophysiological changes, involves remodelling of the cytoskeleton in response to 
plasticity (Dillon and Goda, 2005; Chen, Rex et al., 2007). Particular proteins, such as profilin, 
a regulator of actin polymerisation, are activated by LTP-inducing stimuli and may indicate 
synapses that are potentiating (Ackermann and Matus, 2003). 


Many synapses are "silent", lacking membrane AMPA-receptors; stimulation can produce a 
transition to an active "vocal" state (Kullmann, 2003). Silent synapses do not appear different 
in micrographs, suggesting a negative answer to the question of whether function can be 
inferred from EM structure (Atwood and Wojtowicz, 1999). This means that for WBE 
detecting at least AMPA may be necessary; this could likely be achieved using array 
tomography. 


Connectivity identification 


This step assigns synaptic connections between neurons. 


At present, statistical connectivity rules are used based on proximity, the "Peters' rule", in 
which synaptic connections are assumed where axons with boutons overlap with dendrites 
(Braitenberg and Schuz, 1998; Peters, 1979). This can be used to estimate the statistics of 
synaptic connectivity (Binzegger, Douglas et al., 2004; Shepherd, Stepanyants et al., 2005; 
Kalisman, Silberberg et al., 2003). However, neural geometry cannot predict the strength of 
functional connections reliably, perhaps because synaptic plasticity changes the strength of 
geometrically given synapses (Shepherd, Stepanyants et al., 2005). 


It is not possible to rely on synapses only occurring from axons to dendrites; axo-somatic, 
axo-axonic, and dendro-dendritic (which may be one-way or reciprocal) synapses have been 
observed. Occasionally, several synapses coincide, such as serial axoaxodendritic synapses 
and synaptic glomeruli where an axon synapses onto two dendrites, one of which also 
synapses on the other. 


One possibility is that there exists enough stereotypy (repeating patterns of structural and 
functional features) in the brain to simplify connectivity identification (and possibly 
interpolation) (Silberberg, Gupta et al., 2002). Such information would constrain the 
interpretation process by ruling out certain possibilities, which would at the very least enable 
a speedup. Acquiring the potentially very complex information about stereotyped 
arrangements would be one of the earliest applications and benefits of detailed scanning and 
massive neuroinformatics databases. However, deviations from stereotypy may be of 
particular importance in achieving person WBE since they can represent individual variations 
or characteristics. 


Gap junctions, where the pre- and postsynaptic cells are electrically linked by connexon 
channels through the cell membrane, can be identified by membranes remaining parallel with 
just 2 nm separation and a grid of connexons. They appear to be relatively rare in mammals, 
but occur at least in the retina, inferior olive and lateral vestibular nucleus (Peters and Palay, 
1996). 


At least in simple nervous systems such as C. elegans gene expression contains significant 
information about its connectivity signature (Kaufman, Dror et al., 2006). At the very least, 
the availability of this kind of information could be used to constrain neuron types and 
connectivity. 


Conclusion 


Image processing is a mature area with many commercial and scientific applications. 
Development of basic processing for handling the scan data does not appear to pose many 
problems beyond the need for extremely high-throughput signal and image processing, and 
perhaps the data management issues of the raw scans. 


The neuroinformatics/WBE-related further processing steps pose a research challenge. 
Identifying cellular objects, in particular connectivity and synapses, is a non-trivial image 
interpretation problem that is currently being studied. Basic contouring algorithms appear to 
work reasonably well, and it is likely, given other results in current image recognition, that 
synapse detectors could be constructed. Image interpretation is traditionally a 
computationally costly operation, and may conceivably be a bottleneck in developing early 
WBE models (in the long run, assuming improvements in computer hardware necessary 
anyway for large-scale emulation, this bottleneck is unlikely to remain). 


The hardest and currently least understood issue is estimating emulation parameters from 
imagery (or developing scanning methods that can gain these parameters). This represents 
one of the key research issues WBE needs to answer (if only tentatively) in order to assess its 
viability as a research program. 


Neural simulation 


The area of neural simulation began with the classic Hodgkin and Huxley model of the action 
potential (Hodgkin and Huxley, 1952). At the time, calculating a single action potential using 
a manually cranked calculator took 8 hours of hard manual labour. Since then the ability to 
compute neural activity across large networks has grown enormously (Moore and Hines, 
1994). 


How much neuron detail is needed? 


It is known that the morphology of neurons affects their spiking behaviour (Ascoli, 1999), 
which suggests that neurons cannot simply be simulated as featureless cell bodies. In some 
cases simplifications of morphology can be done based on electrical properties (Rall, 1962), 
but it is unlikely this is generic. 


An issue that has been debated extensively is the nature of neural coding and especially 
whether neurons mainly make use of a rate code (where firing frequency contains the signal) 
or whether the exact timing of spikes matters (Rieke, Warland et al., 1996). While rate codes 
transmitting information have been observed, there also exist fast cognitive processes (such as 
visual recognition) that occur on timescales shorter than the necessary temporal averaging for 
rate codes. Neural recordings have demonstrated both precise temporal correlations between 
neurons (Lestienne, 1996) and stimulus-dependent synchronization (Gray, Konig et al., 1989). 
At present, the evidence that spike timing is essential is incomplete, but there does not appear 
to be any shortage of known neurophysiological phenomena that could be sensitive to it. In 
particular, spike timing dependent plasticity (STDP) allows synaptic connections to be 
strengthened or weakened depending on the exact order of spikes with a precision <5 ms 
(Markram, Lubke et al., 1997; Bi and Poo, 1998). Hence it is probably conservative to assume 
that brain emulation needs a time resolution smaller than 0.4-1.4 ms (Lestienne, 1996) in 
order fully to capture spike timing. 


The time resolution is constrained by the above spike timing requirement as well as numeric 
considerations in modelling action potentials (where the dynamics become very stiff). Ion 
channels open in 0.1 ms, suggesting that emulations that do not depend on individual 
detailed ion channel dynamics might need this time resolution. If they also take individual 
ion channels and enzymes into account, they would have to be 1-3 orders of magnitude more 
detailed. 


One of the most important realizations of computational neuroscience in recent years is that 
neurons in themselves hold significant computational resources. "Dendritic computing" 
involves nonlinear interactions in the dendritic tree, allowing parts of neurons to act as 
artificial neural networks on their own (Single and Borst, 1998; London and Hausser, 2005; 
Sidiropoulou, Pissadaki et al., 2006). It appears possible that dendritic computation is a 
significant function that cannot be reduced into a point cell model but requires calculation of 
at least some neuron subsystems. 


Internal Chemistry 


Brain emulation needs to take chemistry more into account than commonly occurs in current 
computational models (Thagard, 2002). Chemical processes inside neurons have 
computational power on their own and occur on a vast range of timescales (from 
sub-millisecond to weeks). Neuromodulators and hormones can change the causal structure of 
neural networks, e.g. by shifting firing patterns between different attractor states. 


About 200 chemical species have been identified as involved in synaptic plasticity, forming a 
complex chemical network. However, much of the complexity may be redundant parallel 
implementations of a few core functions such as induction, pattern selectivity, expression of 
change, and maintenance of change (where the redundancy improves robustness and offers 
the possibility of fine-tuning) (Ajay and Bhalla, 2006). 


At the very low numbers of molecules found in synaptic spines, chemical noise becomes a 
significant factor, making chemical networks that are bistable at larger volumes unstable 
below the femtoliter level and reducing pattern selection (Bhalla, 2004b, a). It is likely that 
complex formation or activity constrained by membranes is essential for the reliability of 
synapses. However, this may not necessarily require detailed modelling of the complexes 
themselves, just of their statistics. 


Proteomics methods are being applied to synapses, potentially identifying all present 
proteins (Li, 2007). The Synapse protein DataBase contains about 3,000 human synapse- 
related proteins (Zhang, Zhang et al., 2007), although it is likely that many of these are not 
involved in the actual synaptic processing. 


Of the proteins coded by the human genome, around 3.2% (988) have been predicted to be 
regulatory molecules, 2.8% (868) kinases, 5.0% (1543) receptors, 1.2% (376) signalling 
molecules, and 1.3% (406) ion channels. If the relative proportions are the same among the 
unknown proteins (41.7%, 12,809), the numbers should be scaled up by 1.4 (Venter, Adams et 
al., 2001). Posttranslational modifications produce on average 3-6 different proteins per gene 
(Wilkins, Sanchez et al., 1996). This places an upper limit on the number of protein types 
directly involved in neural signalling and internal signal transduction on the order of 35,000. 
If all genes were involved we should expect the proteome to be around 158,000 proteins. 
Later estimates have run up all the way to ~1 million proteins (and 600,000 immunoglobulins 
varying in epitope binding, likely irrelevant for WBE) (Humphery-Smith, 2004), although 
most estimates appear to tend towards the ~100,000 range. 
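
The upper limit of about 35,000 quoted above can be reproduced with a short calculation. The sketch below only illustrates that arithmetic: it uses the gene counts and the scale factor of 1.4 given in the text and assumes the upper end (6) of the 3-6 proteins-per-gene range. The function and variable names are illustrative and not taken from the cited sources.

```python
# Gene counts per functional class, as quoted in the text.
signalling_genes = {
    "regulatory molecules": 988,
    "kinases": 868,
    "receptors": 1543,
    "signalling molecules": 376,
    "ion channels": 406,
}

SCALE_FOR_UNKNOWN = 1.4   # scale-up for unclassified proteins, as stated in the text
VARIANTS_PER_GENE = 6     # assumed: upper end of the 3-6 proteins-per-gene range


def signalling_protein_upper_bound(counts, scale, variants):
    """Upper bound on the number of protein types involved in signalling."""
    return sum(counts.values()) * scale * variants


bound = signalling_protein_upper_bound(
    signalling_genes, SCALE_FOR_UNKNOWN, VARIANTS_PER_GENE
)
print(f"about {bound:,.0f} protein types")
```

The result lands slightly above 35,000, consistent with the order-of-magnitude figure in the text.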


If the requirements for WBE are on the kineome level, then for each involved protein a 
number of species may be needed, depending on the number of possible phosphorylation, 
dimerization, carboxylation, or other altered states that exist. In principle, this could lead to a 
combinatorial explosion of types, requiring storing data for individual protein molecules if 
the number of functionally relevant kinds is very large. 


It is common to simulate cellular regulatory networks using mass action laws and 
Michaelis-Menten kinetics, although this assumes free diffusion and random collisions. This assumption 
does not always hold, since molecular mobility inside cells is limited by the properties of the 
cytoplasm, compartmentalization, and protein anchoring to surfaces. This can be at least 
partially taken into account by adjusting the equations for "fractal kinetics". Whether fractal 
kinetics or mass action is valid depends mainly on the probability of reactions (Grima and 
Schnell, 2006; Schnell and Turner, 2004; Xu and Ding, 2007). 
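
As a minimal sketch of the kind of calculation involved, the following integrates substrate depletion under Michaelis-Menten kinetics with a forward Euler step. The parameter values are arbitrary illustrations, not measured constants, and the well-mixed assumption criticised above is built in; fractal kinetics is not modelled here.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction velocity v = vmax * s / (km + s) for substrate concentration s."""
    return vmax * s / (km + s)


def simulate_substrate(s0, vmax, km, dt, steps):
    """Forward-Euler integration of ds/dt = -v(s); returns the concentration trace."""
    s = s0
    trace = [s]
    for _ in range(steps):
        # Clamp at zero so numerical error cannot produce negative concentrations.
        s = max(0.0, s - dt * michaelis_menten_rate(s, vmax, km))
        trace.append(s)
    return trace


# Illustrative values only (arbitrary units).
trace = simulate_substrate(s0=10.0, vmax=1.0, km=2.0, dt=0.01, steps=1000)
print(f"substrate after 10 time units: {trace[-1]:.2f}")
```

At high substrate concentration the rate saturates at vmax, while at low concentration it reduces to a first-order mass action law with rate constant vmax/km.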


It might also be necessary to model gene expression and other epigenetic mechanisms. 
Long-term plasticity in neurons requires changes in gene expression, both as response to 
plasticity-inducing input and in regulating overall plasticity. In addition, gene expression is 
known to drive the circadian rhythm, which affects many aspects of brain function (Levenson and 
Sweatt, 2005; Miller and Sweatt, 2007; McClung and Nestler, 2008). Whether all forms of 
plasticity or other relevant changes can be simulated with sufficient detail without simulating 
the gene expression network is currently unknown. If not, then the brain emulation would at 
the very least have to simulate the contributing gene regulation network and at most a 
sizeable part of the total network. Gene expression timescales are on the order of minutes to 
hours, suggesting that the timestep of expression simulation does not have to be as short as 
for neurons if it is not done on an individual protein level. 


Learning rules and synaptic adaptation 


A WBE without any synaptic change would likely correspond to a severely confused mind, 
trapped in anterograde amnesia. While working memory may be based on attractor states of 
neural activity in the prefrontal cortex and some forms of priming and habituation plausibly 
are based on synaptic adaptation and calcium build-up, long-term memory formation 
requires changes in the strength (and possibly connectivity) of synapses. 


Synaptic learning has been extensively studied in computational neuroscience, starting with 
Donald Hebb's 1949 suggestion that co-occurring neural firing initiated long-term change. 
Since then models on all levels of abstraction have been studied, ranging from the completely 
abstract to detailed synaptic biochemistry. 


Many network models include various synaptic learning rules. There is a wide variety of 
rules used, ranging from simple Hebbian rules with no state beyond the current weight, over 
the "BCM-rule" accounting for LTP/LTD-effects in rate-coding neurons (Bienenstock, Cooper 
et al., 1982) and STDP that counts time differences between spikes (Senn, Markram et al., 
2001), to detailed signal transduction models that include 30+ substances (Kikuchi, Fujimoto 
et al., 2003; Bhalla and Iyengar, 1999). 
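
To make the STDP idea concrete, the sketch below uses a pair-based rule with exponential windows, a common textbook simplification rather than the specific model of any of the cited papers. The amplitudes and time constants are assumed illustrative values.

```python
import math


def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) potentiates; post-before-pre depresses.
    All parameter values are illustrative assumptions.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0


for dt in (-50.0, -5.0, 5.0, 50.0):
    print(f"dt = {dt:+6.1f} ms -> dw = {stdp_weight_change(dt):+.5f}")
```

Such a rule needs no state beyond the weight and recent spike times, which places it between the simple Hebbian rules and the detailed transduction models mentioned above.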


Synapses also show various forms of adaptation to repeated firing, including both facilitation 
and depression, which in turn can affect the network dynamics in ways that appear likely to 
have behaviourally relevant consequences (Thomson, 2000). Various models have been 
constructed of how synaptic ‘resources’ are depleted and replenished (Tsodyks, Pawelzik et 
al., 1998). Such models usually include a few extra state variables in the synapses and time 
constants for them. 
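
A minimal sketch of such a resource model, in the spirit of (but much simpler than) the cited work: a single state variable tracks the fraction of available resources, each presynaptic spike consumes a fixed fraction of it, and the pool recovers exponentially between spikes. The parameter values and names are assumptions made for illustration.

```python
import math


def depressing_synapse_responses(spike_times_ms, use_fraction=0.5, tau_rec_ms=800.0):
    """Relative response amplitude for each presynaptic spike.

    x is the fraction of synaptic resources currently available (1.0 = full).
    """
    x = 1.0
    last_t = None
    responses = []
    for t in spike_times_ms:
        if last_t is not None:
            # Exponential recovery towards 1 since the previous spike.
            x = 1.0 - (1.0 - x) * math.exp(-(t - last_t) / tau_rec_ms)
        released = use_fraction * x   # resources used by this spike
        responses.append(released)
        x -= released
        last_t = t
    return responses


# A 20 Hz train: successive responses shrink as resources are depleted.
print(depressing_synapse_responses([0.0, 50.0, 100.0, 150.0]))
```

Here the synapse carries one extra state variable and two parameters; adding facilitation would typically add one more of each.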


For WBE it is important to keep the number of state variables and local parameters low in 
synapses, since they dominate the storage demands for neural/compartment models of the 
brain. At present, we do not have a firm estimate of how complex synapses need to be in 
order to reproduce the full range of plasticity observed. It is very likely that simple weight- 
based models are too simple and that models with a handful of state variables cover most 
observed phenomena. Whether the full transduction cascade needs to be simulated is unclear. 
On the plus side, this is an area that has been subject to intense modelling and research on 
multiple scales for several decades and is very accessible to experimental testing. Progress in 
synaptic modelling may be a good indicator of neuroscience understanding. 


Neural models 


The first neural model was the McCulloch-Pitts neuron, essentially binary units summing 
weighted inputs and firing (i.e. sending 1 rather than 0 as output) if the sum was larger than a 
threshold (Hayman, 1999; McCulloch and Pitts, 1943). This model and its continuous state 
successors form the basis of most artificial neural network models. They do not have any 
internal state except the firing level. Their link to real biology is somewhat tenuous, although 
as an abstraction they have been very fruitful. 
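
The binary threshold unit described above can be written in a few lines. This is only an illustration; the weights and threshold are chosen here to implement a logical AND and are not taken from the cited papers.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Binary unit: output 1 if the weighted input sum exceeds the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


# Example: a two-input AND gate (weights and threshold are illustrative).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, mcculloch_pitts([a, b], weights=[1.0, 1.0], threshold=1.5))
```

Note that the unit has no internal state: its output depends only on the current inputs.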


More realistic models such as "integrate-and-fire" sum synaptic potentials and produce 
spikes. 
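
A leaky integrate-and-fire neuron, one common variant of this model class, might be sketched as follows. The membrane parameters are assumed round numbers for illustration (potentials in mV, time in ms, resistance in megaohm, current in nA).

```python
def simulate_lif(input_current_na, dt_ms=0.1, tau_m_ms=10.0, v_rest=-65.0,
                 v_thresh=-50.0, v_reset=-65.0, resistance_mohm=10.0):
    """Forward-Euler leaky integrate-and-fire neuron; returns spike times in ms."""
    v = v_rest
    spike_times = []
    for step, i_ext in enumerate(input_current_na):
        # Leak towards rest plus drive from the injected current.
        dv = (-(v - v_rest) + resistance_mohm * i_ext) / tau_m_ms
        v += dt_ms * dv
        if v >= v_thresh:
            spike_times.append(step * dt_ms)
            v = v_reset   # the spike itself is not modelled, only its time
    return spike_times


# 100 ms of constant input: 2 nA drives repetitive firing, 1 nA stays subthreshold.
print(len(simulate_lif([2.0] * 1000)), "spikes at 2 nA")
print(len(simulate_lif([1.0] * 1000)), "spikes at 1 nA")
```

Unlike the McCulloch-Pitts unit, this model has one internal state variable, the membrane potential, which carries information from one timestep to the next.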


Single cell-modelling can be done on roughly five levels of abstraction (Herz, Gollisch et al., 
2006): as black box modules that generate probabilistic responses to stimuli according to some 
probability distribution, as a series of linear or nonlinear filters of signals, as a single 
compartment with ionic conductances, as a reduced compartment model (few compartments) 
or as a detailed compartmental model. The more abstract models are theoretically and 
computationally tractable and have fewer degrees of freedom, while the more detailed 
models are closer to biological realism and can be linked to empirical data. According to 
(Herz, Gollisch et al., 2006), most task-specific computation of known direct biological 
relevance can be achieved by reduced compartmental models, with the possible exception of 
some forms of nonlinear dendritic processing. 





[Figure 21 panels: A. Characterized Neuron; B. Cable Model; C. Compartmental Model] 


Figure 21: Compartment models of neurons usually begin with an electrophysiologically 
characterized neuron (A). This can be regarded as a network of electrical cables with 
individual properties (B). These are then subdivided into isopotential compartments. The 
system can then be modeled as an equivalent electronic network (C). Each ion channel type 
corresponds to a pair of parallel-connected potentials and variable resistances per 
compartment. Compartment simulations numerically simulate the equivalent network⁸. 
Image from (Bower and Beeman, 1998) 


8 See http://www.brains-minds-media.org/archive/222 for a tutorial in this methodology. 


Conductance-based models are the simplest biophysical representation of neurons, 
representing the cell membrane as a capacitor and the different ion channels as (variable) 
resistances. Neurons or parts of neurons are replaced by their equivalent circuits, which are 
then simulated using ordinary differential equations⁹. Beside the membrane potential they 
have (at least) two gating variables for each membrane current as dynamical variables. 


The core assumptions in conductance-based models are that different ion channels are 
independent of each other; that the gating variables are independent of each other, depending 
only on voltage (or other factors such as calcium); that the gating variables follow first-order 
kinetics; and that the region being simulated is isopotential. 
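
As an illustration of these assumptions, the following is a minimal single-compartment sketch of the classic Hodgkin-Huxley equations, integrated with a simple forward Euler step. The parameter values are the commonly quoted squid-axon values in the modern sign convention (resting potential near -65 mV); they and the integration scheme are assumptions of this sketch rather than a prescription for WBE.

```python
import math

C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV


def _x_over_1_minus_exp(x, scale):
    """x / (1 - exp(-x/scale)), handling the removable singularity at x = 0."""
    if abs(x) < 1e-7:
        return scale
    return x / (1.0 - math.exp(-x / scale))


def rates(v):
    """Voltage-dependent opening (a) and closing (b) rates of the gates, in 1/ms."""
    a_m = 0.1 * _x_over_1_minus_exp(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _x_over_1_minus_exp(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def simulate_hh(i_ext, t_stop_ms=50.0, dt_ms=0.01):
    """Membrane potential trace (mV) for a constant injected current in uA/cm^2."""
    v, m, h, n = -65.0, 0.05, 0.6, 0.32   # approximate resting state
    trace = []
    for _ in range(int(round(t_stop_ms / dt_ms))):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        i_na = G_NA * m**3 * h * (v - E_NA)   # sodium current
        i_k = G_K * n**4 * (v - E_K)          # potassium current
        i_l = G_L * (v - E_L)                 # leak current
        dv = (i_ext - i_na - i_k - i_l) / C_M
        # First-order kinetics for each gating variable.
        m += dt_ms * (a_m * (1.0 - m) - b_m * m)
        h += dt_ms * (a_h * (1.0 - h) - b_h * h)
        n += dt_ms * (a_n * (1.0 - n) - b_n * n)
        v += dt_ms * dv
        trace.append(v)
    return trace


print(f"peak potential with drive: {max(simulate_hh(10.0)):.1f} mV")
```

The state consists of the membrane potential plus three gating variables, each depending only on voltage, and the whole cell is treated as a single isopotential region.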


More complex ion channel models with internal states in the channels have been developed, 
as well as models including calcium dynamics (possibly with several forms of calcium 
buffering). 


In simple neuron models, the "neuronic" (firing rate update) equations can be uncoupled 
from the "mnemonic" (synaptic weight update) equations, the "adiabatic learning 
hypothesis" (Caianiello, 1961). However, realistic models often include a complex interplay at 
synapses between membrane potential, calcium levels, and conductances that make this 
uncoupling difficult to preserve. 


A common guideline in modelling is to use compartments 1/10 to 1/20 of the length constant¹⁰ 
of the dendrite (or axon), making potential differences between compartments differ just by 
2-5%. Typical length constants are on the order of 2-5 mm, giving compartments smaller than 
200-100 µm. In heavily branching neurons, the short distance between branches forces finer 
resolution, on the order of 10 µm or less. It can therefore be expected that the majority of 
compartments will be due to cortical arbors rather than the long axons through the white 
matter. 
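
The guideline can be turned into a small calculation. The sketch below uses the standard passive-cable expression for the length constant, sqrt((d/4) R_M/R_A), and then takes 1/20 to 1/10 of it as the compartment length. The example diameter and resistances are assumed round numbers for illustration only.

```python
import math


def length_constant_m(diameter_m, r_membrane_ohm_m2, r_axial_ohm_m):
    """Passive cable length constant: sqrt((d/4) * R_M / R_A), in metres."""
    return math.sqrt((diameter_m / 4.0) * r_membrane_ohm_m2 / r_axial_ohm_m)


def compartment_length_range_m(lam_m):
    """Compartment lengths following the 1/20 to 1/10 length-constant guideline."""
    return lam_m / 20.0, lam_m / 10.0


# Illustrative values: 2 um diameter, R_M = 2 ohm m^2, R_A = 1 ohm m.
lam = length_constant_m(2e-6, 2.0, 1.0)
shortest, longest = compartment_length_range_m(lam)
print(f"length constant {lam * 1e3:.1f} mm, "
      f"compartments {shortest * 1e6:.0f}-{longest * 1e6:.0f} um")
```

For a 2 mm length constant the same rule gives the 100-200 µm compartments mentioned in the text.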


The number of state variables of a neuron at least scales with the number of synapses since 
each synapse has its own dynamics. The number of synapses is also a rough estimate of the 
number of compartments needed for an accurate morphological model. Each compartment 
has to store a list of neighbour compartments, dynamical variables, and local parameters. 
Synapses can be treated as regular compartments with extra information about weight, 
neurotransmitters, and internal chemical state. A synapse-resolution compartment model of 
10¹¹ neurons with on the order of 10⁴ compartments each would require 10¹⁵ compartments. 


If states in extracellular space matter (e.g. due to volume transmission, ephaptic signals or 
neurogenesis), it may be necessary to divide the simulation into a spatial grid rather than 
compartments following the neurons. A volume-based simulation where the brain is divided 
into size r voxels would encompass 1.4·10⁻³/r³ voxels (with r in metres). Each voxel would contain information 
about which cells, compartments, and other information that existed inside, as well as a list of 
the dynamical variables (local electric fields, chemical concentrations) and local parameter 
values. For 10 µm side voxels there would be 1.4·10¹² voxels in a human brain. 
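
The compartment and voxel counts in the last two paragraphs follow from simple arithmetic, sketched below. The brain volume of 1.4 litres (1.4·10⁻³ m³) is an assumption of this sketch, as are the function names.

```python
BRAIN_VOLUME_M3 = 1.4e-3   # assumed human brain volume, about 1.4 litres


def compartment_count(neurons=1e11, compartments_per_neuron=1e4):
    """Total compartments for a synapse-resolution compartment model."""
    return neurons * compartments_per_neuron


def voxel_count(side_m, volume_m3=BRAIN_VOLUME_M3):
    """Number of cubic voxels of the given side length needed to tile the volume."""
    return volume_m3 / side_m ** 3


print(f"compartments: {compartment_count():.1e}")
print(f"10 um voxels: {voxel_count(10e-6):.1e}")
```

Under these assumptions a 10 µm grid needs roughly a thousand times fewer elements than a synapse-resolution compartment model, although each voxel would have to hold more heterogeneous information.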


9 This makes the simplifying assumption that the neuron can be divided into isopotential compartments. A more 
complex model would describe the membrane state by partial differential equations; this would be highly 
cumbersome. Fortunately there are no currently known phenomena that appear to require such models. 

10 The length constant λ defines how quickly an imposed voltage decays with distance in a passive cable, 
$V(x) = V_0 \exp(-x/\lambda)$, $\lambda = \sqrt{(d/4) R_M / R_A}$, where d is the diameter, $R_M$ the membrane resistance per square meter and $R_A$ the axial 
resistance per meter. 


Reduced models 


Figure 22: Computational cost for different neuron models versus biological plausibility. 
From (Izhikevich, 2004), figure 2. 


(Izhikevich, 2004) reviews both typical neural firing patterns and a number of computational 
models of spiking neurons. He estimates both the number of possible biological features 
different models can achieve and how many floating point instructions are needed per ms of 
simulation (only assuming a soma current, not taking the effects of dendrites and synapses 
into account¹¹): 


Table 7: Neuron model costs 


Model                             # of biological features    FLOPS/ms 
Integrate-and-fire                3                           5 
Integrate-and-fire with adapt.    5                           10 
Integrate-and-fire-or-burst       10                          13 
Resonate-and-fire                 12                          10 
Quadratic integrate-and-fire      6                           7 
Izhikevich (2003)                 21                          13 
FitzHugh-Nagumo                   11                          72 
Hindmarsh-Rose                    18                          120 
Morris-Lecar                      14*                         600 
Wilson                            15                          180 
Hodgkin-Huxley                    19*                         1200 


* Only the Morris-Lecar and Hodgkin-Huxley models are “biophysically meaningful” in the 
sense that they attempt actually to model real biophysics; the others only aim for a correct 
phenomenology of spiking. 

11 To a first approximation each compartment would require the same number of FLOPS, making a 
multicompartment model about 3-4 orders of magnitude more demanding. 


The (Izhikevich, 2003) model is interesting since it demonstrates that it may be possible to 
improve the efficiency of calculations significantly (two orders of magnitude) without losing 
too many features of the neuron activity. The model itself is a two-variable dynamical system 
with four model parameters. It was derived from the Hodgkin-Huxley equations using a 
bifurcation analysis methodology keeping the geometry of phase-space intact (Izhikevich, 
2007). While it is not directly biophysically meaningful, it (or similar reduced models of full 
biophysics) may be a possible computational shortcut in brain emulation. Whether such 
reductions can be done depends on whether or not the details of internal neural biophysics 
are important for network-relevant properties such as exact spike-timing. It may also be 
possible to apply reduction methods on sub-neural models, but the approach requires an 
understanding of the geometry of phase space of the system. 
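
For concreteness, a minimal sketch of the (Izhikevich, 2003) model follows, using the equations and the "regular spiking" parameter values a, b, c, d as published. The forward Euler step size and the constant input current are assumptions of this sketch.

```python
def simulate_izhikevich(i_ext, t_stop_ms=1000.0, dt_ms=0.25,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Spike times (ms) of a single Izhikevich neuron under constant input.

    v is the membrane potential (mV) and u a recovery variable.
    """
    v = c
    u = b * v
    spike_times = []
    for step in range(int(round(t_stop_ms / dt_ms))):
        v += dt_ms * (0.04 * v * v + 5.0 * v + 140.0 - u + i_ext)
        u += dt_ms * a * (b * v - u)
        if v >= 30.0:          # spike cut-off, followed by the reset rule
            spike_times.append(step * dt_ms)
            v = c
            u += d
    return spike_times


print(len(simulate_izhikevich(10.0)), "spikes in one second at I = 10")
```

Each update costs only a handful of arithmetic operations, which is the source of the low FLOPS figure in Table 7; other firing patterns are obtained by changing the four parameters rather than the equations.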


Simulators 


There exist numerous simulation systems at present. Some of the more common are GENESIS 
(GEneral NEural SImulation System) (Wilson, Bhalla et al., 1989; Bower and Beeman, 1998) 
and NEURON (Carnevale and Hines, 2006). For a review and comparisons, see (Brette, Rudolph 
et al., 2007). 


Key issues for neural simulators are numerical stability, extendability and parallelizability. 
The numerical methods used to integrate conductance-based models need to both produce 
accurate approximation of solutions of the governing equations and run fast. This is made 
more problematic by the stiffness of some of the equations. Most neural simulators have been 
designed to be easy to extend with new functions, often producing very complex software 
systems. For WBE, this might not be necessary once the basic neuroscience has been 
elucidated. Neural simulators need to be able to run on parallel computers to reach high 
performance (see section below). This is a key need for WBE. 


Parallel simulation 


Networks, neurons, and compartments are in general just linked to nearby entities and act 
simultaneously, making brain models naturally suited for parallel simulations. The main 
problem is finding the right granularity of the simulation (i.e. how many and which entities 
to put on each processing node) so that communications overhead is minimized, or finding 
communications methods that allow the nodes to communicate efficiently. 


Simulations can be time-driven or event-driven. A time-driven simulation advances one 
timestep at a time while an event-driven simulation keeps a queue of future events (such as 
synaptic spikes arriving) and advances directly to the next. For some neural network models, 
such as integrate-and-fire, the dynamics between spikes can be calculated exactly, allowing 
the simulation efficiently just to jump forward in time to when the next spike occurs (Mattia 
and Giudice, 2000). However, for highly connected networks the times between the arrivals 
of spikes become very short, and time-driven simulations are equally efficient. On the other 
hand, the timestep for time-driven models must be short enough that the discretization of 
spike timing to particular timesteps does not disrupt timing patterns, or various techniques 
for keeping sub-timestep timing information must be employed in the simulation (Morrison, 
Mehring et al., 2005). 
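
The event-driven approach can be sketched with a priority queue of pending spike deliveries. The toy model below uses leaky integrate-and-fire neurons with instantaneous synapses, for which the decay between events has an exact closed form, so the simulation never steps through empty time. The network format, parameter values and function name are assumptions made for this illustration.

```python
import heapq
import math


def run_event_driven(connections, initial_events, t_stop,
                     tau_m=20.0, threshold=1.0, delay=1.0):
    """Event-driven simulation of leaky integrate-and-fire neurons.

    connections: dict mapping a neuron id to a list of (target_id, weight).
    initial_events: list of (time, target_id, weight) input spikes.
    Returns a list of (time, neuron_id) for every spike fired.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    potential = {}     # membrane potential of each neuron at its last update
    last_update = {}   # time of that last update
    spikes = []
    while queue:
        t, target, weight = heapq.heappop(queue)   # jump straight to the next event
        if t > t_stop:
            break
        # Exact exponential decay since the neuron was last touched.
        elapsed = t - last_update.get(target, t)
        v = potential.get(target, 0.0) * math.exp(-elapsed / tau_m) + weight
        if v >= threshold:
            spikes.append((t, target))
            v = 0.0
            for nxt, w in connections.get(target, []):
                heapq.heappush(queue, (t + delay, nxt, w))
        potential[target] = v
        last_update[target] = t
    return spikes


# A three-neuron chain: one input spike propagates with a 1 ms delay per hop.
chain = {0: [(1, 1.2)], 1: [(2, 1.2)]}
print(run_event_driven(chain, [(0.0, 0, 1.2)], t_stop=10.0))
```

Spike times here are continuous values rather than multiples of a timestep, so no timing information is lost to discretization; the price is the queue management cost on every event.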


Following (Brette, Rudolph et al., 2007), the computational cost per second of biological time 
is of order $c_U N/dt + c_P F N p$ for time-driven simulations and $(c_U + c_S + c_Q) F N p$ for event-driven 
simulations. $c_U$ is the cost per update, $c_P$ the cost for one spike propagation (assumed to be 
smaller than the update cost), $c_S$ the cost of calculating the time of the next spike, $c_Q$ the cost 
of enqueuing/dequeuing a spike (assumed to be 
constant), N the number of neurons, F the rate of firing, p the number of synapses per neuron 
and dt the timestep. For F=1 Hz, p=10,000 and dt=0.1 ms, the timestep dependent term of the 
first equation is likely to dominate: smaller timesteps can increase the computational cost 
strongly. In the second equation there is no time constant dependency, but the multiplicative 
term is likely to be large and can suffer if the queue management has a nontrivial cost. The 
effective time constant in an event-driven simulation is going to be on the order of 1/Fp, 
making the time- and event-driven simulations about equally fast. If there are gap junctions 
or dendro-dendritic interactions the time complexity becomes much higher (Brette, Rudolph 
et al., 2007). 
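
The two cost expressions above can be compared numerically. In the sketch below the per-operation costs (update, propagation, next-spike calculation, queue handling) are placeholder unit values, since the real figures depend on the model and implementation; only the scaling with N, F, p and dt is meant to be illustrative.

```python
def time_driven_cost(n, f_hz, p, dt_s, c_update=1.0, c_propagate=1.0):
    """Operations per second of biological time: updates plus spike propagation."""
    return c_update * n / dt_s + c_propagate * f_hz * n * p


def event_driven_cost(n, f_hz, p, c_update=1.0, c_spike=1.0, c_queue=1.0):
    """Operations per second of biological time: every delivered spike is an event."""
    return (c_update + c_spike + c_queue) * f_hz * n * p


# Values from the text: F = 1 Hz, p = 10,000 synapses, dt = 0.1 ms; N is arbitrary.
n = 1e6
print(f"time-driven:  {time_driven_cost(n, 1.0, 1e4, 1e-4):.2e}")
print(f"event-driven: {event_driven_cost(n, 1.0, 1e4):.2e}")
# Halving the timestep doubles the update term of the time-driven cost.
print(f"time-driven, dt = 0.05 ms: {time_driven_cost(n, 1.0, 1e4, 5e-5):.2e}")
```

With these values 1/dt equals F·p, so the two approaches come out within a small constant factor of each other, and which one wins depends on the relative per-operation costs.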


Well-implemented simulations tend to scale linearly with number of processors, although 
various memory and communications bottlenecks may occur and optimal use of caching can 
give even superlinear speedup for some problem sizes (Djurfeldt, Johansson et al., 2005; 
Migliore, Cannia et al., 2006). The main problem appears to be high connectivity, since inter- 
processor communications are a major bottleneck. Keeping communications to a minimum, 
for example by only sending information about when and where a spike has occurred 
(Johansson and Lansner, 2007) or running dendritic sub-trees on the same processor (Hines, 
Markram et al., 2008), improves performance significantly. If brain emulation requires more 
information than this to flow between processing nodes performance will be lower than these 
examples. 


Current large-scale simulations 


Representative current large-scale brain simulations include the following (see also Appendix 
C for further examples). 


A simulation of 16 Purkinje cells with 4,500 compartments receiving input from 244,000 
granule cells took 2.5 hours on a 128-processor Cray T3E (nominally 76.8 GFLOPS) to 
calculate 2 seconds of simulated activity (Howell, Dyhrfjeld-Johnsen et al., 2000). This implies 
around 0.3 MFLOPS per neuron¹² and a slowdown factor of 4,500. 


The Blue Brain Project aims at modelling mammalian brains at various scales and levels, in 
particular modelling a complete cortical column in biological detail (Markram, 2006). The 
cortical column uses about 10,000 morphologically complex neurons connected with 10⁸ 
synapses on an 8,000 processor BlueGene supercomputer. 


The so far (2006) largest simulation of a full Hodgkin-Huxley neuron network was performed 
on the IBM Watson Research Blue Gene supercomputer using the simulator SPLIT 
(Hammarlund and Ekeberg, 1998; Djurfeldt, Johansson et al., 2005). It was a model of cortical 
minicolumns, consisting of 22 million 6-compartment neurons with 11 billion synapses, with 
spatial delays corresponding to a 16 cm² cortex surface and a simulation length of one second 
real-time. Most of the computational load was due to the synapses, each holding 3 state 
variables¹³. The overall nominal computational capacity used was 11.5 TFLOPS, giving 0.5 
MFLOPS per neuron or 1045 FLOPS per synapse. Simulating one second of neural activity 
took 5,942 s¹⁴. The simulation showed linear scaling in performance with the number of 
processors up to 4,096 but began to show some (23%) overhead for 8,192 processors 
(Djurfeldt, Lundqvist et al., 2006). 
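
The per-neuron, per-synapse and slowdown figures quoted for the Purkinje cell and SPLIT simulations follow from the nominal machine capacity and run times given in the text. The short sketch below reproduces that arithmetic; the helper names are illustrative.

```python
def slowdown(wall_clock_s, simulated_s):
    """How many seconds of computing one second of biological time costs."""
    return wall_clock_s / simulated_s


def flops_per_unit(total_flops, units):
    """Nominal machine capacity divided evenly over neurons or synapses."""
    return total_flops / units


# Purkinje/granule cell simulation: 2.5 hours for 2 s on a nominal 76.8 GFLOPS.
purkinje_slowdown = slowdown(2.5 * 3600, 2.0)
purkinje_per_neuron = flops_per_unit(76.8e9, 16 + 244_000)

# SPLIT simulation: 11.5 TFLOPS, 22 million neurons, 11 billion synapses.
split_per_neuron = flops_per_unit(11.5e12, 22e6)
split_per_synapse = flops_per_unit(11.5e12, 11e9)
split_slowdown = slowdown(5942, 1.0)

print(f"Purkinje: slowdown {purkinje_slowdown:.0f}, "
      f"{purkinje_per_neuron / 1e6:.2f} MFLOPS per neuron")
print(f"SPLIT: slowdown {split_slowdown:.0f}, "
      f"{split_per_neuron / 1e6:.2f} MFLOPS per neuron, "
      f"{split_per_synapse:.0f} FLOPS per synapse")
```

These are nominal figures: they divide peak machine capacity by model size and therefore include all overhead, so they are upper bounds on the useful computation per element.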


A simulation involving 8 million simple spiking Izhikevich neurons with 6,300 synapse/cell 
with a low firing rate and STDP achieved a slowdown of just 10 on a 4,096 processor Blue 
Gene supercomputer. This was achieved by both optimizing the synaptic updates and by 
reducing the number of inter-processor messages by collecting all spikes from a given node 
that were sent to another node into a single message (Frye, Ananthanarayanan et al., 2007). 





12 This also includes overhead. The Purkinje cell models used significantly more computations per 
neuron than the granule cell models. 

13 Conductance, depression and facilitation. 

14 Mikael Djurfeldt, personal communication. 


An even larger simulation with 10¹¹ neurons and 10¹⁵ synapses was done in 2005 by Eugene 
M. Izhikevich on a Beowulf cluster with 27 3 GHz processors (Izhikevich, 2005). This was 
achieved by not storing the synaptic connectivity but by generating it whenever it was 
needed, making this model rather ill suited for brain emulation. One second of simulation 
took 50 days, giving a slowdown factor of 4.2 million. 


(Johansson and Lansner, 2007) estimate the computational demands for simulating the 
mammalian neocortex of different species, based on a particular model of its function. 
Assuming that minicolumns organized into hypercolumns are the computational units and 
that these communicate through spike trains and using a cluster architecture, they show that 
it is feasible to run a network of intermediate size between rat and cat on a current 
supercomputer cluster. They claim that using this kind of high-level network (and assuming 
continued linear scaling) a macaque-sized cortex could be run on a Blue Gene/L in real-time. 


Conclusion 


Current neural simulations tend to have neuron numbers and structures fitted to the 
available computers (in particular in terms of connectivity). This is important for the linear 
scaling of speed with processor number and gives a performance improvement at the 
expense of generality. WBE-scale computing will likely have a computational granularity 
fitted to the structure of the brain, unless the computational power available is so large that 
inefficiencies due to a loose fit are irrelevant. 


Models are currently not driven by scanned data, and individual variations in neuron 
properties are generated by drawing from a pre-set random distribution. While this allows 
models that generate needed parameters only when required and hence avoid storing them, 
this is relatively rare and most implementations do use stored parameters. Hence there does 
not seem to be any difference between storage requirements for WBE models and random 
models, assuming the same underlying parameters and structure. 


The speedup of different networks ranges over six orders of magnitude, but most are a 
hundredfold to a thousandfold slower than biology, and a few are close to real-time. This is 
likely more due to research practice than any inherent limitation: model sizes are selected so 
that they fit available computing power. The length of the simulation is set to produce data 
comparable to some biological measurement (which, for neural networks, tend to be a few 
seconds long for neurocognitively interesting cases), and the length of simulation time 
corresponds to what constitutes an acceptable turnaround time for a computer centre or an 
office computer. Smaller simulations can run faster until they hit bottlenecks set by 
inter-processor communication speed. It is hence likely that even early WBE will be close to 
real-time, with computational constraints, rather than speed, limiting the length of the 
simulation. 


We clearly today have computational capabilities sufficient to simulate many invertebrate 
brains (such as snails, ants, and fruit flies) using compartment models on parallel clusters of 
modest size. Small mammalian brains appear computationally within reach. 


Given the differences between models, implementation, computers and other parameters it is 
hard to reliably compare current large-scale models to estimate trends. There is also a lack of 
historical data, as large models have only recently become practically possible. Hence a 
conservative estimate assumes that the models will grow based solely on computer power 
and not on any algorithmic improvement. If models reliably expressing sublinear 
computational demands appear, they would likely make large-scale computer power 
irrelevant as a limiting factor for WBE. 


The major computational load scales with the number of synapses or (for compartment 
models) compartments, since they are the most numerous entities. Hence, one likely useful 
metric would be the number of synaptic updates per second, as well as the computational 
demands per synapse. 
Body simulation 


The body simulation translates between neural signals and the environment, as well as 
maintains a model of body state as it affects the brain emulation. 


How detailed the body simulation needs to be in order to function depends on the goal. An 
"adequate" simulation produces enough and the right kind of information for the emulation 
to function and act, while a convincing simulation is nearly or wholly indistinguishable from 
the "feel" of the original body. 


A number of relatively simple biomechanical simulations of bodies connected to simulated 
nervous systems have been created to study locomotion. (Suzuki, Goto et al., 2005) simulated 
the C. elegans body as a multi-joint rigid link where the joints were controlled by 
motor neurons in a simulated motor control network. Örjan Ekeberg has simulated 
locomotion in lamprey (Ekeberg and Grillner, 1999), stick insects (Ekeberg, Blümel et al., 
2004), and the hind legs of cat (Ekeberg and Pearson, 2005) where a rigid skeleton is moved 
by muscles either modeled as springs contracting linearly with neural signals, or in the case 
of the cat, a model fitting observed data relating neural stimulation, length, and velocity with 
contraction force (Brown, Scott et al., 1996). These models also include sensory feedback from 
stretch receptors, enabling movements to adapt to environmental forces: locomotion involves 
an information loop between neural activity, motor response, body dynamics, and sensory 
feedback (Pearson, Ekeberg et al., 2006). 


Today biomechanical model software enables fairly detailed models of muscles, the skeleton, 
and the joints, enabling calculation of forces, torques, and interaction with a simulated 
environment (Biomechanics Research Group Inc, 2005). Such models tend to simplify muscles 
as lines and make use of pre-recorded movements or tensions to generate the kinematics. 


A detailed mechanical model of human walking has been constructed with 23 degrees of 
freedom driven by 54 muscles. However, it was not controlled by a neural network but rather 
used to find an energy-optimizing gait (Anderson and Pandy, 2001). A state-of-the-art model 
involving 200 rigid bones with over 300 degrees of freedom, driven by muscular actuators 
with excitation-contraction dynamics and some neural control, has been developed for 
modelling human body motion in a dynamic environment, e.g. for ergonomics testing 
(Ivancevic and Beagley, 2004). This model runs on a normal workstation, suggesting that 
rigid body simulation is not a computationally hard problem in comparison to WBE. 


Other biomechanical models are being explored for assessing musculoskeletal function in 
humans (Fernandez and Pandy, 2006), and can be validated or individualized by use of MRI 
data (Arnold, Salinas et al., 2000) or EMG (Lloyd and Besier, 2003). It is expected that near 
future models will be based on volumetric muscle and bone models found using MRI 
scanning (Blemker, Asakawa et al., 2007; Blemker and Delp, 2005), as well as construction of 
topological models (Magnenat-Thalmann and Cordier, 2000). There are also various 
simulations of soft tissue (Benham, Wright et al., 2001), breathing (Zordan, Celly et al., 2004) 
and soft tissue deformation for surgery simulation (Cotin, Delingette et al., 1999). 


Another source of body models comes from computer graphics, where much effort has gone 
into rendering realistic characters, including modelling muscles, hair and skin. The emphasis 
has been on realistic appearance rather than realistic physics (Scheepers, Parent et al., 1997), 
but increasingly the models are becoming biophysically realistic and overlapping with 
biophysics (Chen and Zeltzer, 1992; Yucesoy, Koopman et al., 2002). For example, 30 
contact/collision coupled muscles in the upper limb with fascia and tendons were generated 
from the visible human dataset and then simulated using a finite volume method; this 
simulation (using one million mesh tetrahedra) ran at a rate of 240 seconds per frame on a 
single-CPU 3.06 GHz Xeon (on the order of a few GFLOPS) (Teran, Sifakis et al., 2005). Scaling 
this up 20 times to encompass ≈600 muscles implies a computational cost on the order of a 
hundred TFLOPS for a complete body simulation. 
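
The scaling argument above can be sketched numerically. This is only an illustration under stated assumptions: the text gives 240 seconds per frame, "a few GFLOPS" and a factor of 20, but the exact CPU throughput (taken here as 3 GFLOPS) and the real-time frame rate (taken here as 10 frames per second) are assumptions, not figures from (Teran, Sifakis et al., 2005). 

```python
def full_body_flops(cpu_gflops, seconds_per_frame, muscle_scale, frames_per_second):
    """Sustained FLOPS needed to run the scaled-up muscle model in real time."""
    # Work done for one frame of the 30-muscle upper-limb model.
    flop_per_frame = cpu_gflops * 1e9 * seconds_per_frame
    # Scale to the whole body, then require the assumed real-time frame rate.
    return flop_per_frame * muscle_scale * frames_per_second


# Assumed inputs: 3 GFLOPS CPU and 10 frames per second are illustrative guesses.
estimate = full_body_flops(cpu_gflops=3.0, seconds_per_frame=240.0,
                           muscle_scale=20, frames_per_second=10.0)
print(f"about {estimate / 1e12:.0f} TFLOPS")
```

With these assumed inputs the result lands on the order of a hundred TFLOPS; a different frame rate or throughput shifts the figure proportionally. 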


Physiological models are increasingly used in medicine for education, research and patient 
evaluation. Relatively simple models can accurately simulate blood oxygenation (Hardman, 
Bedforth et al., 1998). For a body simulation this might be enough to provide the right 
feedback between exertion and brain state. Similarly simple nutrient and hormone models 
could be used insofar as a realistic response to hunger and eating were desired. 


Conclusion 


Simulating a realistic human body is kinematically possible today, requiring computational 
power ranging between workstations and mainframes. For simpler organisms such as 
nematodes or insects correspondingly simpler models could be (and have been) used. Since the 
need for early WBE is merely adequate body simulation, the body does not appear to pose a 
major bottleneck. 


However, since many motor actions involve a rich interplay between mechanics and nerve 
signals it should not be surprising if relatively complex models are needed for apparently 
simple actions such as standing or walking. Newly started emulations may need time and 
effort to learn to use their unfamiliar bodies. 
Environment simulation 


The environment simulation provides a simulated physical environment for the body 
simulation. One can again make the distinction between an adequate environment simulation 
and a convincing simulation. An adequate environment produces enough input to activate 
the brain emulation and allow it to interact in such a way that its state and function can be 
evaluated. A convincing simulation is close enough to reality that the kinds of signals and 
interactions that occur are hard (or impossible) for the emulated being to distinguish from 
reality. 


It seems likely that we already have the tools for making adequate environments in the form 
of e.g. game 3D rendering engines with physics models or virtual environments such as 
Second Life. While not covering more than sight and sound, they might be enough for testing 
and development. For emulations of simpler brains such as C. elegans simulations with 
simplified hydrodynamics (similar to (Ekeberg and Grillner, 1999)) may be enough, possibly 
extended with simulated chemical gradients to guide behaviour. 


Convincing environments might be necessary only if the long-term mental well-being of 
emulated humans (or other mammals) is at stake. While it is possible that a human could 
adapt to a merely adequate environment, it seems likely that it would experience such an 
environment as confining or lacking in sensory stimulation. Note that even in a convincing 
environment simulation not all details have to fit physical reality perfectly (Bostrom, 2003). 
Plausible simulation is more important than accurate simulation in this domain and may 
actually improve the perceived realism (Barzel, Hughes et al., 1996). In addition, humans 
accept surprisingly large distortions (20% length change of objects when not paying direct 
attention, 3% when paying attention (Harrison, Rensink et al., 2004)), allowing a great deal of 
leeway in constructing a convincing environment. 


What quality of environment is needed to completely fool the senses? In the following we 
will assume that the brain emulation runs in real-time, i.e., that one second of simulation time 
corresponds to one second of outside time. For slower emulations, the environment model 
would be slowed comparably, and all computational demands divided by the scale factor. 


At the core of the environment model would be a physics engine simulating the mechanical 
interactions between the objects in the environment and the simulated body. It would not 
only update object positions depending on movement and maintain a plausible physics, it 
would also provide collision and contact information needed for simulated touch. On top of 
this physics simulation a series of rendering engines for different senses would produce the 
raw data for the senses in the body model. 


Vision 


Visual photorealism has been sought in computer graphics for about 30 years, and this 
appears to be a fairly mature area at least for static images and scenes. Much effort is 
currently going into such technology, for use in computer games and movies. 


(McGuigan, 2006) proposes a "graphics Turing test" and estimates that for 30 Hz interactive 
visual updates 518.4-1036.8 TFLOPS would be enough for Monte Carlo global illumination. 
This might actually be an overestimate since he assumes generation of complete pictures. 
Generating only the signal needed for the retinal receptors (with higher resolution for the 
fovea than the periphery) could presumably reduce the demands. Similarly, more efficient 
implementations of the illumination model (or a cheaper one) would also reduce demands 
significantly. 


Hearing 


The full acoustic field can be simulated over the frequency range of human hearing by 
solving the differential equations for air vibration (Garriga, Spa et al., 2005). While accurate, 
this method has a computational cost that scales with the volume simulated, up to 16 TFLOPS 
for a 2x2x2 m room. This can likely be reduced by the use of adaptive mesh methods, or ray- 
or beam-tracing of sound (Funkhouser, Tsingos et al., 2004). 


Sound generation occurs not only from sound sources such as instruments, loudspeakers, and 
people but also from normal interactions between objects in the environment. By simulating 
surface vibrations, realistic sounds can be generated as objects collide and vibrate. A basic 
model with N surface nodes requires 0.5292 N GFLOPS, but this can be significantly reduced 
by taking perceptual shortcuts (Raghuvanshi and Lin, 2006; Raghuvanshi and Lin, 2007). This 
form of vibration generation can likely be used to synthesize realistic vibrations for touch. 


Smell and Taste 


So far no work has been done on simulated smell and taste in virtual reality, mainly due to 
the lack of output devices. Some simulations of odorant diffusion have been done in 
underwater environments (Baird, Johari et al., 1996) and in the human and rat nasal 
cavity (Keyhani, Scherer et al., 1997; Zhao, Dalton et al., 2006). In general, an odor simulation 
would involve modelling diffusion and transport of chemicals through air flow; and the 
relatively low temporal and spatial resolution of human olfaction would likely allow a fairly 
simple model. A far more involved issue is what odorant molecules to simulate. Humans 
have 350 active olfactory receptor genes, but we can likely detect more variation due to 
different diffusion in the nasal cavity (Shepherd, 2004). 


Taste appears even simpler in principle to simulate since it only comes into play when objects 
are placed in the mouth and then only through a handful of receptor types. However, the 
taste sensation is a complex interplay between taste, smell, and texture. It may be necessary to 
have particularly fine-grained physics models of the mouth contents in order to reproduce a 
plausible eating experience. 


Haptics 


The haptic senses of touch, proprioception, and balance are crucial for performing skilled 
actions in real and virtual environments (Robles-De-La-Torre, 2006). 


Tactile sensation relates both to the forces affecting the skin (and hair) and to how they are 
changing as objects or the body are moved. To simulate touch stimuli, collision detection is 
needed to calculate forces on the skin (and possibly deformations) as well as the vibrations 
when it is moved over a surface or exploring it with a hard object (Klatzky, Lederman et al., 
2003). To achieve realistic haptic rendering, updates in the kilohertz range may be necessary 
(Lin and Otaduy, 2005). In environments with deformable objects various nonlinearities in 
response and restitution have to be taken into account (Mahvash and Hayward, 2004). 
Proprioception, the sense of how far muscles and tendons are stretched (and by inference, 
limb location) is important for maintaining posture and orientation. Unlike the other senses, 
proprioceptive signals would be generated by the body model internally. Simulated Golgi 
organs, muscle spindles, and pulmonary stretch receptors would then convert body states 
into nerve impulses. 


The balance signals from the inner ear appear relatively simple to simulate, since they are only 
dependent on the fluid velocity and pressure in the semicircular canals (which can likely 
be assumed to be laminar and homogeneous) and gravity effects on the utricle and saccule. 
Compared to other senses, the computational demands are minuscule. 


Thermoreception could presumably be simulated by giving each object in the virtual 
environment a temperature, activating thermoreceptors in contact with the object. 
Nociception (pain) would be simulated by activating the receptors in the presence of 
excessive forces or temperatures; the ability to experience pain from simulated inflammatory 
responses may be unnecessary verisimilitude. 


Conclusion 


Rendering a convincing environment for all senses probably requires on the order of several 
hundred TFLOPS. While significant by today's standards, this represents a minuscule fraction 
of the computational resources needed for brain emulation, and is not necessary for meeting 
the basic success criteria of emulation. 
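

As a rough cross-check of the "several hundred TFLOPS" figure, the per-sense estimates quoted in this section can be tallied. The sketch below is only illustrative: the vision, room acoustics and per-node sound figures come from the text, while the number of vibrating surface nodes is a hypothetical input, and haptics, smell and taste are left out because no figures are given for them. The `slowdown` argument follows the earlier remark that demands are divided by the scale factor for slower-than-real-time emulations. 

```python
VISION_TFLOPS = (518.4, 1036.8)   # McGuigan (2006) range quoted in the text
ROOM_ACOUSTICS_TFLOPS = 16.0      # upper figure for a 2x2x2 m room
SOUND_GFLOPS_PER_NODE = 0.5292    # basic surface-vibration model, per node


def environment_tflops(surface_nodes, slowdown=1.0):
    """Return (low, high) TFLOPS for vision + acoustics + sound synthesis."""
    sound = SOUND_GFLOPS_PER_NODE * surface_nodes / 1000.0  # GFLOPS -> TFLOPS
    low = VISION_TFLOPS[0] + ROOM_ACOUSTICS_TFLOPS + sound
    high = VISION_TFLOPS[1] + ROOM_ACOUSTICS_TFLOPS + sound
    # A slower emulation needs proportionally less environment computation.
    return low / slowdown, high / slowdown


# 10,000 surface nodes is an assumed, purely illustrative scene size.
low, high = environment_tflops(surface_nodes=10_000)
print(f"real time: {low:.0f} to {high:.0f} TFLOPS")
low10, high10 = environment_tflops(surface_nodes=10_000, slowdown=10.0)
print(f"10x slower: {low10:.0f} to {high10:.0f} TFLOPS")
```

Vision dominates the total under these assumptions, which is why cheaper illumination models or retina-only rendering would reduce the overall demand most. 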




Computer requirements 


WBE requires significant computer power and storage for image processing and 
interpretation during the scanning process, and to hold and run the resulting emulation. Both 
problems appear to be strongly parallelizable. 


A way of estimating the distance between current capabilities and those needed for human 
WBE is to estimate the number of entities/state variables needed to specify the emulation, the 
required time resolution, and compare this to trends in computer hardware. This will depend 
on what level of detail a WBE can be achieved at; full molecular simulation would require 
vastly more computational power than a model using simplified neurons. For a given level of 
simulation, we can then estimate the earliest possible date—assuming that Moore's law 
continues unchanged—when that kind of emulation will be possible for a given amount of 
money. 


Appendix B analyses current computing trends in detail. The main conclusion is that memory 
per dollar increases one order of magnitude per 4.8 years and processing power per dollar 
increases one order of magnitude per 3.7, 3.9 or 6.4 years depending on whether one bases the 
prediction on supercomputer price/performance, supercomputer power or commodity 
computers. We will use the optimistic supercomputer estimate and the more cautious 
commodity estimates to get an overall range. 


It should be noted that the estimates below make merely order of magnitude estimates of the 
number of entities and complexity of their storage; they are quite debatable. However, given 
that an order of magnitude complexity increase only adds circa 5 years to the estimate, exact 
numbers are not necessary. We are also ignoring body and environment simulation because, 
as shown in the preceding sections, they likely require only a fraction of the brain emulation 
computations. 


Table 8: Storage demands (emulation only, human brain) 

Level | Model | # entities | Bytes per entity | Memory demands (TB) | Earliest year, $1 million 
1 | Computational module | 100-1,000? | ? | ? | ? 
2 | Brain region connectivity | 10^5 regions, 10^7 connections | 3? (2-byte connectivity, 1 byte weight) | 3·10^-5 | Present 
3 | Analog network population model | 10^8 populations, 10^13 connections | 5 (3-byte connectivity, 1 byte weight, 1 byte extra state variable) | 50 | Present 
4 | Spiking neural network | 10^11 neurons, 10^15 connections | 8 (4-byte connectivity, 4 state variables) | 8,000 | 2019 
5 | Electrophysiology | 10^15 compartments x 10 state variables = 10^16 | 1 byte per state variable | 10,000 | 2019 
6 | Metabolome | 10^16 compartments x 10^2 metabolites = 10^18 | 1 byte per state variable | 10^6 | 2029 
7 | Proteome | 10^16 compartments x 10^3 proteins and metabolites = 10^19 | 1 byte per state variable | 10^7 | 2034 
8 | States of protein complexes | 10^16 compartments x 10^3 proteins x 10 states = 10^20 | 1 byte per state variable | 10^8 | 2038 
9 | Distribution of complexes | 10^16 compartments x 10^3 proteins and metabolites x 100 states/locations | 1 byte per state variable | 10^9 | 2043 
  | Full 3D EM map (Fiala, 2002) | 50x2.5x2.5 nm | 1 byte per voxel, compressed | 10^9 | 2043 
10 | Stochastic behaviour of single molecules | 10^25 molecules | 31 (2 bytes molecule type, 14 bytes position, 14 bytes velocity, 1 byte state) | 3.1·10^14 | 2069 
11 | Quantum | Either ≈10^26 atoms, or smaller number of quantum-state carrying molecules | Qbits | ? | ? 


For the case of a "Manhattan project" spending $10^9, subtract 14.4 years from these estimates. 
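

The arithmetic behind the storage table can be sketched as follows. This is a minimal illustration rather than the authors' own calculation: memory demand is taken as entities times bytes per entity, and the "Manhattan project" offset is taken as the memory decade time of 4.8 years (from Appendix B) times the number of orders of magnitude of extra budget. The function names are hypothetical. 

```python
import math

BYTES_PER_TB = 1e12
MEMORY_DECADE_TIME_YEARS = 4.8  # years per tenfold gain in memory per dollar


def memory_demand_tb(entities, bytes_per_entity):
    """Storage in terabytes for a given number of stored entities."""
    return entities * bytes_per_entity / BYTES_PER_TB


def years_saved(budget_factor, decade_time_years):
    """Years gained by spending budget_factor times more money."""
    return decade_time_years * math.log10(budget_factor)


# Level 4: 10^15 connections at 8 bytes each.
print(memory_demand_tb(1e15, 8))
# Level 3: 10^13 connections at 5 bytes each.
print(memory_demand_tb(1e13, 5))
# Going from $10^6 to $10^9 is a factor of 1,000 in budget.
print(years_saved(1_000, MEMORY_DECADE_TIME_YEARS))
```

The same two functions reproduce the level 3 and level 4 memory figures and the 14.4-year offset quoted above. 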


Table 9: Processing demands (emulation only, human brain) 

Level | Model | # entities | FLOPS per entity | Time-steps per second | CPU demand (FLOPS) | Earliest year, $1 million (commodity computer estimate) | Earliest year, $1 million (supercomputer estimate) 
1 | Computational module | 100-1,000? | ? | ? | ? | ? | ? 
2 | Brain region connectivity | 10^5 regions, 10^7 connections | ? | ? | ? | ? | ? 
3 | Analog network population model | 10^8 populations, 10^13 connections | 1 | 10^2 | 10^15 | 2023 | 2008 (see footnote 15) 
4 | Spiking neural network | 10^11 neurons, 10^15 connections | 10 | 10^3 | 10^18 | 2042 | 2019 
5 | Electrophysiology | 10^15 compartments x 10 state variables = 10^16 | 10^3 FLOPS per ms per compartment | 10^4 | 10^22 | 2068 | 2033 
6 | Metabolome | 10^16 compartments x 10^2 metabolites = 10^18 | 10^3 | 10^4 | 10^25 | 2087 | 2044 
7 | Proteome | 10^16 compartments x 10^3 proteins and metabolites = 10^19 | 10^3 | 10^4 | 10^26 | 2093 | 2048 
8 | States of protein complexes | 10^16 compartments x 10^3 proteins x 10 states = 10^20 | 10^3 | 10^4 | 10^27 | 2100 | 2052 
9 | Distribution of complexes | 10^16 compartments x 10^3 proteins and metabolites x 100 states per location = 10^21 | 10^3 | 10^6 | 10^30 | 2119 | 2063 
10 | Stochastic behavior of single molecules | 10^25 molecules | 10^3 | 10^15 | 10^43 | 2201 | 2111 
11 | Quantum | Either ≈10^26 atoms, or smaller number of quantum-state carrying molecules | Qbits | 10^15-10^20 | ? | ? | ? 

15 "Roadrunner" at Los Alamos National Laboratory achieved 1.7 petaflops on May 25, 2008. 


For the case of a "Manhattan project" spending $10^9, subtract 19.1/11.1 years from these 
estimates, respectively. 


A rough estimate for simpler brains is that a macaque brain has 14% of the human synapses, 
cat brains 3%, rat 0.26%, mouse 0.1% (Johansson and Lansner, 2007). Assuming simulation 
demands scale by synapse number (likely for compartment models) this means macaque 
emulations on a given level can be achieved 5.4/3.2 years earlier, cat emulations 9.7/5.6 years, 
rat emulations 16/9.6 years and mouse emulations 19/11 (depending on which column above 
is used). Emulations of honey bees (950,000 neurons) and Aplysia (20,000) appear feasible 
today at a fairly high scale (51/30, 61/36 years earlier respectively). A slower decade time of 
computer improvement produces a longer gap between animal emulations and human 
emulations. 
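

The year offsets for simpler brains follow from the decade times. As a hedged sketch: if demands scale with synapse number, a brain with a fraction f of the human synapses needs log10(1/f) fewer orders of magnitude of computing power, and each order of magnitude takes one decade time. The decade times of 6.4 years (commodity) and 3.7 years (supercomputer) are those stated earlier in this section; the function name is hypothetical. 

```python
import math

COMMODITY_DECADE_YEARS = 6.4      # years per tenfold gain, commodity computers
SUPERCOMPUTER_DECADE_YEARS = 3.7  # years per tenfold gain, supercomputers


def years_earlier(synapse_fraction, decade_time_years):
    """Years before the human estimate at which a smaller brain becomes feasible."""
    return -decade_time_years * math.log10(synapse_fraction)


# Synapse fractions relative to human, as quoted from Johansson and Lansner (2007).
fractions = {"macaque": 0.14, "cat": 0.03, "rat": 0.0026, "mouse": 0.001}
for species, fraction in fractions.items():
    commodity = years_earlier(fraction, COMMODITY_DECADE_YEARS)
    supercomputer = years_earlier(fraction, SUPERCOMPUTER_DECADE_YEARS)
    print(f"{species}: {commodity:.1f} / {supercomputer:.1f} years earlier")
```

The computed values land within about half a year of the figures quoted in the text; the small differences presumably reflect rounding of the decade times. The same formula shows why a slower decade time stretches the gap between animal and human emulations. 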


Conclusions 


It appears feasible within the foreseeable future to store the full connectivity or even 
multistate compartment models of all neurons in the brain within the working memory of a 
large computing system. 


Achieving the performance needed for real-time emulation appears to be a more serious 
computational problem. However, the uncertainties in this estimate are also larger since it 
depends on the currently unknown number of required states, the computational complexity 
of updating them (which may be amenable to drastic improvements if algorithmic shortcuts 
can be found), the presumed limitation of computer hardware improvements to a Moore's 
law growth rate, and the interplay between improving processors and improving 
parallelism (see footnote 16). A rough conclusion would nevertheless be that if electrophysiological models 
are enough, full human brain emulations should be possible before mid-century. Animal 
models of simple mammals would be possible one to two decades before this. 


16 This can be seen in the ongoing debates about whether consumer GPU performance should be regarded as being in 
the $0.2 range per GFLOPS; if WBE can be mapped to such high-performance cheap special purpose hardware 
several orders of magnitude of improvement can be achieved. 
Validation 


As computer software increased in complexity, previous methods of debugging became 
insufficient, especially as software development moved from small groups to large projects in 
large organisations. This led to the development of a number of software testing 
methodologies aiming at improving quality (Gelperin and Hetzel, 1988). Currently 
neuroscience appears to be in an early "debugging" paradigm where data and procedures 
certainly are tested in various ways, but usually just through replication or as an ad hoc 
activity when something unexpected occurs. For large-scale neuroscience, testing and 
validation methods need to be incorporated in the research process to ensure that the process 
works, that the data provided to other parts of the research is accurate and that the link 
between reality and model is firm. 


An important early/ongoing research goal would be to quantify the importance of each level 
of scale of phenomena to the resultant observed higher brain functions of interest. In 
particular, it is important to know to what level of detail they need to be simulated in order to 
achieve the same kind of emergent behaviour on the next level. In some cases it might be 
possible to “prove” this by performing simulations/emulations at different levels of 
resolution and comparing their results. This would be an important application of early 
small-scale emulations and would help pave the way for larger ones. 


Ideally, a "gold standard" model at the highest possible resolution could be used to test the 
extent to which it is possible to deviate from this before noticeable effects occur, and to 
determine which factors are irrelevant for higher-level phenomena. Some of this information 
may already exist in the literature, some needs to be discovered through new computational 
neuroscience research. Exploration of model families with different levels of biological detail 
is already done occasionally. 


A complementary approach is to develop manufactured data where the ground truth is 
known ("phantom datasets"), and then apply reconstruction methods on this data to see how 
well they can deduce the true network. For example, the NETMORPH system models neurite 
outgrowth which creates detailed neural morphology and network connectivity (Koene, 
2008). This can then be used to make virtual slices. Multiple datasets can be generated to test 
the overall reliability of reconstruction methods. 
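

The phantom-dataset idea can be sketched in a few lines. This is not NETMORPH and not a real reconstruction pipeline: the ground truth here is an assumed random directed network, and the "reconstruction" is a hypothetical stand-in that simply misses some true connections and invents some false ones. The point is only to show how known ground truth lets reconstruction quality be scored across multiple generated datasets. 

```python
import random


def make_phantom_network(n_neurons, p_connect, rng):
    """Ground-truth directed connectivity as a set of (pre, post) pairs."""
    return {
        (pre, post)
        for pre in range(n_neurons)
        for post in range(n_neurons)
        if pre != post and rng.random() < p_connect
    }


def noisy_reconstruction(truth, n_neurons, miss_rate, false_rate, rng):
    """Stand-in for a reconstruction method: drops and invents connections."""
    found = {edge for edge in sorted(truth) if rng.random() >= miss_rate}
    for pre in range(n_neurons):
        for post in range(n_neurons):
            edge = (pre, post)
            if pre != post and edge not in truth and rng.random() < false_rate:
                found.add(edge)
    return found


def score(truth, found):
    """Precision and recall of the reconstructed connections."""
    true_positives = len(truth & found)
    precision = true_positives / len(found) if found else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


rng = random.Random(0)
for trial in range(3):  # several phantom datasets to gauge overall reliability
    truth = make_phantom_network(50, 0.1, rng)
    found = noisy_reconstruction(truth, 50, miss_rate=0.05, false_rate=0.01, rng=rng)
    precision, recall = score(truth, found)
    print(f"phantom {trial}: precision={precision:.2f}, recall={recall:.2f}")
```

In a real validation setting the random network would be replaced by generated morphology and virtual slices, and the stand-in by the actual image processing and tracing methods under test. 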
Discussion 


As this review shows, WBE on the neuronal/synaptic level requires relatively modest 
increases in microscopy resolution, a less trivial development of automation for scanning and 
image processing, a research push at the problem of inferring functional properties of 
neurons and synapses, and relatively business-as-usual development of computational 
neuroscience models and computer hardware. This assumes that this is the appropriate level 
of description of the brain, and that we find ways of accurately simulating the subsystems 
that occur on this level. Conversely, pursuing this research agenda will also help detect 
whether there are low-level effects that have significant influence on higher level systems, 
requiring an increase in simulation and scanning resolution. 


There do not appear to exist any obstacles to attempting to emulate an invertebrate organism 
today. We are still largely ignorant of the networks that make up the brains of even modestly 
complex organisms. Obtaining detailed anatomical information of a small brain appears 
entirely feasible and useful to neuroscience, and would be a critical first step towards WBE. 
Such a project would serve as both a proof of concept and a test bed for further development. 


If WBE is pursued successfully, at present it looks like the need for raw computing power for 
real-time simulation and funding for building large-scale automated scanning/processing 
facilities are the factors most likely to hold back large-scale simulations. 
Appendix A: Estimates of the computational capacity/demands of the human brain 


The most common approach is a straightforward multiplicative estimate: multiply the number of 
neurons by the average number of synapses per neuron and by an assumed amount of 
information per synapse (for storage) or number of operations per second per synapse (for 
processing). This multiplicative method has been applied to microtubuli and proteins too. 
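

A minimal sketch of the multiplicative method, using inputs quoted for two of the sources in Table 10: 100 billion neurons, 1,000 connections and 200 calculations per second (Kurzweil, 1999), and 10 billion neurons, 10,000 synaptic operations per cycle and a 100 Hz cycle (Dix, 2005). The function names are hypothetical, and the one-bit-per-synapse storage example combines assumed round numbers purely for illustration. 

```python
def synaptic_ops_per_second(neurons, synapses_per_neuron, ops_per_synapse_per_second):
    """Processing estimate: every synapse performs a fixed number of operations per second."""
    return neurons * synapses_per_neuron * ops_per_synapse_per_second


def synaptic_storage_bits(neurons, synapses_per_neuron, bits_per_synapse):
    """Storage estimate: every synapse holds a fixed number of bits."""
    return neurons * synapses_per_neuron * bits_per_synapse


kurzweil = synaptic_ops_per_second(10**11, 1_000, 200)
dix = synaptic_ops_per_second(10**10, 10_000, 100)
# Illustrative only: 10^11 neurons, 1,000 synapses each, one bit per synapse.
one_bit_storage = synaptic_storage_bits(10**11, 1_000, 1)

print(f"Kurzweil-style estimate: {kurzweil:.1e} operations per second")
print(f"Dix-style estimate: {dix:.1e} synaptic operations per second")
print(f"One bit per synapse: {one_bit_storage:.1e} bits")
```

Because the result is a plain product, an order-of-magnitude uncertainty in any one factor carries straight through to the final estimate, which is one reason the published figures span such a wide range. 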


However, it still might be necessary to store concentrations of several chemical species, 
neurotransmitter types and other data if a biologically realistic model is needed (especially 
the identities of the pre- and postsynaptic neurons). Some estimates of the storage 
requirements of brain emulation are included in the table below. 


Other estimation methods are based on analogy or constraints. (Moravec, 1999) suggested 
exploiting the known requirements of image processing by equating them with a 
corresponding neural structure (the retina), and then scaling up the result. (Merkle, 1989a) 
used energy constraints on elementary neural operations. (Landauer, 1986) attempted an 
estimation based on experimental psychological memory and signal theory. 


An assumption of roughly one bit of information per synapse has some support on 
theoretical grounds. Models of associative neural networks have an information storage 
capacity slightly under 1 bit per synapse depending on what kind of information is encoded 
(Nadal, 1991; Nadal and Toulouse, 1990). Extending the dynamics of synapses for storing 
sequence data does not increase this capacity (Rehn and Lansner, 2004). Geometrical and 
combinatorial considerations suggest 3-5 bits per synapse (Stepanyants, Hof et al., 2002; 
Kalisman, Silberberg et al., 2005). Fitting theoretical models to Purkinje cells suggests that 
they can reach 0.25 bits/synapse (Brunel, Hakim et al., 2004). 


Table 10: Estimates of computational capacity of the human brain. Units have been 
converted into FLOPS and bits whenever possible. Levels refer to Table 2. 

Source | Assumptions | Computational demands | Memory 
(Leitl, 1995) | Assuming 10^? neurons, 1,000 synapses per neuron, 34 bit ID per neuron and 8 bit representation of dynamic state, synaptic weights and delays. [Level 5] | | 5·10^15 bits (but notes that the data can likely be compressed). 
(Tuszynski, 2006) | Assuming microtubuli dimer states as bits and operating on nanosecond switching times. [Level 10] | 10^? FLOPS | 8·10^? bits 
(Kurzweil, 1999) | Based on 100 billion neurons with 1,000 connections and 200 calculations per second. [Level 4] | 2·10^16 FLOPS | 10^? bits 
(Thagard, 2002) | Argues that the number of computational elements in the brain is greater than the number of neurons, possibly even up to the 10^? individual protein molecules. [Level 8] | 10^? FLOPS | 
(Landauer, 1986) | Assuming 2 bits learning per second during conscious time, experiment based. [Level 1] | | 1.5·10^9 bits (10^9 bits with loss) 
(Neumann, 1958) | Storing all impulses over a lifetime. | | 10^20 bits 
(Wang, Liu et al., 2003) | Memories are stored as relations between neurons. | | 10^8432 bits (See footnote 17) 
(Freitas Jr., 1996) | 10^? neurons, 1,000 synapses, firing 10 Hz [Level 4] | 10^? bits/second | 
(Bostrom, 1998) | 10^11 neurons, 5·10^3 synapses, 100 Hz, each signal worth 5 bits. [Level 5] | 10^17 operations per second | 
(Merkle, 1989a) | Energy constraints on Ranvier nodes. | 2·10^15 operations per second (10^13-10^16 ops/s) | 
(Moravec, 1999; Moravec, 1988; Moravec, 1998) | Compares instructions needed for visual processing primitives with retina, scales up to brain and 10 times per second. Produces 1,000 MIPS neurons. [Level 3] | 10^8 MIPS | 8·10^14 bits. 
(Merkle, 1989a) | Retina scale-up. [Level 3] | 10^12-10^14 operations per second. | 
(Dix, 2005) | 10 billion neurons, 10,000 synaptic operations per cycle, 100 Hz cycle time. [Level 4] | 10^16 synaptic ops/s | 4·10^15 bits (for structural information) 
(Cherniak, 1990) | 10^10 neurons, 1,000 synapses each. [Level 4] | | 10^13 bits 
(Fiala, 2007) | 10^14 synapses, identity coded by 48 bits plus 2x36 bits for pre- and postsynaptic neuron id, 1 byte states. 10 ms update time. [Level 4] | 256,000 terabytes/s | 2·10^16 bits (for structural information) 
(Seitz) | 50-200 billion neurons, 20,000 shared synapses per neuron with 256 distinguishable levels, 40 Hz firing. [Level 5] | 2·10^? synaptic operations per second | 4·10^15 - 8·10^15 bits 
(Malickas, 1996) | 10^11 neurons, 10^2-10^4 synapses, 100-1,000 Hz activity. [Level 4] | 10^15-10^18 synaptic operations per second | 
 | 1·10^11 neurons, each with 10^4 compartments running the basic Hodgkin-Huxley equations with 1200 FLOPS each (based on (Izhikevich, 2004)). Each compartment would have 4 dynamical variables and 10 parameters described by one byte each. | 1.2·10^18 FLOPS | 1.12·10^28 bits 

17 This information density is far larger than the Bekenstein black hole entropy bound on the information 
content in material systems (Bekenstein, 1981). 
Appendix B: Computer Performance Development 


Forecasts of future computer performance are needed to make rough estimates of when brain 
emulation will become feasible in terms of computer storage and processing capacity. The 
following estimates are (with two exceptions) based on data from John C. McCallum's 
datasets of CPU price performance (McCallum, 2003), memory performance (McCallum, 
2007b) and disk performance (McCallum, 2007a). 


Processing Power 


Plotting the performance of computer systems gives an estimate of available total computer 
power of unitary, off-the-shelf systems (Figure 23). 


Figure 23: Processing power (MIPS) over time. The legend gives a fitted exponential in which 
log10(MIPS) increases by 0.141556 per year. 


A least squares fit has been made with an exponential curve (a straight line in this semilog plot). 
The decade time, the time it takes for an order of magnitude increase in MIPS, is 7.1 years. 
Figure 24: Processing power per dollar over time. The legend gives a fitted exponential in which 
log10(MIPS/$) increases by 0.178929 per year; the horizontal axis (Year) spans 1940 to 2030. 


A more relevant measure is the amount of computing power a dollar can buy. Using the data 
and adjusting for inflation to 2007 dollars (this has been done in all estimates below) we get a 
decade time of just 5.6 years (Figure 24). The bootstrap 95% confidence interval is 5.3-5.9. 
Present performance is about 1 MIPS per dollar. The trend appears reliable, but possibly 
accelerating. 
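
The link between a fitted slope, a decade time and a bootstrap interval can be sketched in a few lines. This is only an illustration: the slope 0.178929 is the one quoted in the legend of Figure 24, while the noisy series generated by `synthetic_mips_per_dollar` is hypothetical and is not McCallum's data. The resampling scheme (drawing data points with replacement and taking percentiles) is an assumption about how such an interval might be computed, not a description of the procedure actually used.

```python
import math
import random

def decade_time(slope_per_year):
    """Years per tenfold increase, given the slope of log10(value) per year."""
    return 1.0 / slope_per_year

def fit_log10_slope(years, values):
    """Ordinary least-squares slope of log10(values) against years."""
    logs = [math.log10(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    return sxy / sxx

def bootstrap_decade_time(years, values, resamples=2000, seed=1):
    """Percentile 95% interval for the decade time, resampling (year, value) pairs."""
    rng = random.Random(seed)
    n = len(years)
    estimates = []
    for _ in range(resamples):
        picks = [rng.randrange(n) for _ in range(n)]
        xs = [years[i] for i in picks]
        ys = [values[i] for i in picks]
        if len(set(xs)) < 2:
            continue  # a slope needs at least two distinct years
        estimates.append(decade_time(fit_log10_slope(xs, ys)))
    estimates.sort()
    low = estimates[int(0.025 * len(estimates))]
    high = estimates[int(0.975 * len(estimates))]
    return low, high

def synthetic_mips_per_dollar(seed=0):
    """Hypothetical noisy series with the quoted slope; NOT the real dataset."""
    rng = random.Random(seed)
    years = list(range(1950, 2008))
    values = [10 ** (0.178929 * (y - 2007) + rng.gauss(0.0, 0.5)) for y in years]
    return years, values

if __name__ == "__main__":
    print(round(decade_time(0.178929), 2))  # decade time implied by the legend slope
    years, values = synthetic_mips_per_dollar()
    print(bootstrap_decade_time(years, values))
```

The reciprocal relation is the only part taken from the text: a slope of s orders of magnitude per year means one order of magnitude every 1/s years.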


Is the rate of development changing? Fitting curves to 20-year intervals of the MIPS/$ 
estimates produces a range of estimates of the development exponent, clustering together 
into two or three groups. The decade times are roughly 4.4, 8.7 and 3.5 years respectively. The 
slowest development speed occurred during the 1970s and 1980s, perhaps due to the 
proliferation of cheap home computers built with less powerful processors for economic 
reasons. Since then it has picked up speed again. 


Fitting to computers in the same price classes (prices equal within an order of magnitude) 
produces decade times ranging from 4.3 to 7.1 years, with the cheapest computers becoming 
more economical fastest. Clearly, the commodity market is driving development fast (but 
parallelisation has the potential to speed development even further, as described below). 


Given these results it looks likely that the trend will continue (for some unknown duration) 
with a decade time of between 3.5 and 8.7 years, with 5.6 as a middle estimate. 


FLOPS vs. MIPS 


These measurements have used MIPS as a measure of computational power, but it is not a 
reliable indicator of actual performance in numeric-heavy applications (e.g. see (Nordhaus, 
2001) for a criticism). For that, MFLOPS (millions of floating-point operations per second) 
is more suitable. Unfortunately, it is not as widely tested as MIPS. 


Figure 25: Linpack performance per dollar over time. 


The floating-point heavy measure, maximum LINPACK performance per dollar, produces a 
graph similar to (if noisier than) that for MIPS (Figure 25). The decade time is 7.7 years, similar 
to the MIPS estimate but with greater uncertainty (the bootstrap confidence interval is 6.5-9.2 years). 
Figure 26: MIPS vs. MFLOPS. 


The logarithms of estimated MIPS and max FLOPS have a correlation of 0.79, suggesting a fairly 
robust link between them. Fitting a relationship suggests that FLOPS scales as MIPS to the power of 
0.89, i.e. with an exponent slightly below unity. 
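
A power-law relation of this kind is usually obtained by a straight-line fit to the logarithms. The sketch below shows one way to do it; the function name `fit_power_law` and the example machines are hypothetical and do not come from the dataset used here.

```python
import math

def fit_power_law(mips, flops):
    """Fit flops ~ c * mips**k by least squares on log10 values.

    Returns (k, c, r), where r is the correlation of the logarithms.
    """
    xs = [math.log10(m) for m in mips]
    ys = [math.log10(f) for f in flops]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                      # exponent: slope in log-log space
    c = 10 ** (mean_y - k * mean_x)    # proportionality constant
    r = sxy / math.sqrt(sxx * syy)     # correlation of the logarithms
    return k, c, r

# Hypothetical machines, for illustration only (not from the dataset).
example_mips = [1.0, 10.0, 100.0, 1000.0, 10000.0]
example_mflops = [0.5, 3.0, 40.0, 200.0, 2500.0]

if __name__ == "__main__":
    k, c, r = fit_power_law(example_mips, example_mflops)
    print(f"exponent={k:.2f} constant={c:.2f} correlation={r:.2f}")
```

An exponent below 1 means that each tenfold increase in MIPS is accompanied by a somewhat smaller than tenfold increase in FLOPS.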


Is this exponent an artefact of early computers where numerical operations were done by one 
or more instructions and not representative of current computers where co-processors and 
other optimizations allow multiple numerical operations in the same clock cycle? 
Figure 27: Fitted conversion exponent from MIPS to FLOPS for different 5 year intervals. 


Fitting to smaller time intervals (length 5 years) from 1970 to 2000 produced a range of 
exponents (Figure 27). To test their reliability, their values were calculated for each interval 
several times, each time with a different sample removed, producing a distribution (shown as 
grey dots) of exponents that allowed a calculation of mean and standard deviation (shown in 
blue). The more reliable exponents are all close to 1, with a statistically significant increase 
from 1986 to 1995. However, any long-term trend towards higher exponents seems hard to 
support. 
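
The reliability check described above, refitting with one sample removed at a time, can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the exponent is the least-squares slope in log-log space; the input data for any real use would be the machines falling inside one 5-year interval.

```python
import math
import statistics

def loglog_slope(mips, flops):
    """Least-squares slope of log10(flops) against log10(mips)."""
    xs = [math.log10(m) for m in mips]
    ys = [math.log10(f) for f in flops]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def leave_one_out_exponents(mips, flops):
    """Refit the exponent once per sample, each time omitting that sample."""
    exponents = []
    for skip in range(len(mips)):
        xs = [m for i, m in enumerate(mips) if i != skip]
        ys = [f for i, f in enumerate(flops) if i != skip]
        exponents.append(loglog_slope(xs, ys))
    return exponents

def summarize(exponents):
    """Mean and sample standard deviation of the refitted exponents."""
    return statistics.mean(exponents), statistics.stdev(exponents)
```

A small standard deviation indicates that no single machine dominates the fit for that interval; a large one marks the interval's exponent as unreliable.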


Is the effect different in smaller computers than in supercomputers? Plotting MIPS vs FLOPS, 
colouring the samples by their price, and then fitting exponentials produces the following 
graph (Figure 28). 


Figure 28: Relation between MIPS and FLOPS for different price range computers. 


This graph seems to suggest that the more expensive (and presumably more powerful) 
computers produce fewer FLOPS per MIPS than the cheaper ones, which have a slightly 
superlinear relation. However, this may be biased by the small sample of computers in the 
data set for which FLOPS, MIPS, and price are all available. 


Altogether, assuming that FLOPS grow as MIPS to the power of 0.8 is probably a safe 
assumption, but the exponent could become >1 if there were a focus on high performance 
calculation in processors (or other computer parts such as graphics cards). There is also a 
sizeable spread of proportionality constants, at least two orders of magnitude. 


Putting the previous considerations together, and assuming 1 MIPS/$ in 2007, we should 
expect F MFLOPS of performance to be achievable at a price of P dollars in 

(T/0.8) · log10(F / (2.3 · P^0.8)) 

years, assuming a decade time of T. 
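
As a rough worked form of this estimate, the sketch below evaluates the waiting time (T/0.8) · log10(F / (2.3 · P^0.8)), counted from 2007. The function name and argument names are hypothetical; the constants 0.8 and 2.3 are the exponent and proportionality factor quoted in the text.

```python
import math

def years_until(target_mflops, price_dollars, decade_time,
                exponent=0.8, coefficient=2.3):
    """Years after 2007 until `target_mflops` can be bought for `price_dollars`.

    Assumes 1 MIPS/$ in 2007, MIPS/$ growing tenfold every `decade_time`
    years, and MFLOPS = coefficient * MIPS**exponent.
    """
    ratio = target_mflops / (coefficient * price_dollars ** exponent)
    return (decade_time / exponent) * math.log10(ratio)

if __name__ == "__main__":
    # Example: one petaflops (1e9 MFLOPS) for one million dollars,
    # under the slow, middle and fast decade times discussed above.
    for t in (8.7, 5.6, 3.5):
        print(t, round(years_until(1e9, 1e6, t), 1))
```

A negative result means that the performance was already affordable at that price before 2007. Because the decade time enters as a multiplicative factor, the uncertainty range of 3.5 to 8.7 years translates directly into a factor of about 2.5 in the waiting time.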


Other estimates 


William D. Nordhaus has made historical estimates of computer performance since the 1800s, 
including pre-electronic computers (Nordhaus, 2001) (partially based on data from (Moravec, 
1988)). He uses millions of standardized operations per second (MSOPS) as his measure, 1 
MSOPS roughly corresponding to 20 million 32-bit additions (see footnote 18). MSOPS correlates 
strongly with additions per second and cycles per second. 





18 The calculation used is SOPS = 0.05 ((6 + log2(memory) + word length) / ((7 × add time + 
multiplication time) / 8)) 


Figure 29: Nordhaus data of computer performance per dollar over time. The legend gives a 
fitted exponential in which log10(MSOPS/$) increases by 0.179863 per year. 
The data show a sharp breakpoint around 1940 when electronic computers began to develop. 
Before this time, performance per price was nearly stationary; afterwards it grew 
exponentially. The decade time is 5.6 years, agreeing with previous estimates. He also 
observes that supercomputer performance per price seems to lag behind that of 
commodity computers. 


The data in (Kurzweil, 1999) tell a similar story, with a decade time of 6.7 years. 
Figure 30: Peak performance of Top500 dataset of supercomputers. 


The Top500 supercomputing list (see footnote 19) tabulates the 500 most powerful supercomputers 
in the world. Over 1993-2003 these machines had a doubling time for the Linpack test of 13.5 
months (Feitelson, 2005), giving a decade time of just 3.82 years. Using the full dataset 1993- 
2008 gives a slightly shorter decade time, 3.7 years (this occurs both when using the general 
Linpack performance and when using the peak performance). A likely reason for the faster growth 
rate than for single processor systems is that the number of processors per computer is also 
increasing exponentially. While the data does not include price information, the current 
(June 2008) top system achieves 1 petaflops performance for ~$100 million, achieving 10 
megaflops/$. This appears to break previous trend predictions, suggesting that extreme high- 
performance systems may actually be improving much faster than the smaller systems. 
However, as smaller systems begin to use parallel computation the same rapid expansion 
may occur there too. 





19 www.top500.org 
Figure 31: Projected supercomputer performance from www.top500.org. 


Estimating available computing power for WBE 


There are three ways of estimating the computing power available for doing WBE. The first is 
to extrapolate the performance to price ratio of commodity computers, the second is to use 
top supercomputer performance directly, the third is to estimate the performance to price 
ratio for supercomputers. All three have merits and problems. 


Commodity computers represent a lower bound: they are not very parallel so far and the 
economic forces driving them are not all aiming for maximum performance or even 
performance per dollar. We have data stretching back longer for them than for high 
performance computers. Using the above data produces the formula 


[MFLOPS/$] = 2.33 · 10^(0.8 (t - 2007)/D) 


where t is the year and D is the decade time of MIPS, 5.6 years (between 3.5 and 8.7 years 
with 95% confidence). 


Absolute supercomputer performance followed a very regular exponential growth between 
1993 and 2006, and has enabled reliable short-term predictions (Strohmaier and Meuer, 2004). 
The growth is faster than for commodity computers due to rapidly increasing parallelisation. 
However, there is reason to believe this parallelisation growth may conflict with the 
requirements of energy, cooling and space in the near future (see below). Since a WBE project 
is likely to be a major research undertaking, having a top 500 supercomputer available is very 
likely (several of the current top supercomputers have been used for computational 
neuroscience); the full cost of the computer may be shared between the WBE project and 
other projects. 
Assuming a computer on par with the #500 computer, the available power grows as 

[FLOPS] = 8.996 · 10^6 · 10^((t - 2008)/D) 

where D is 3.92. Up to two orders of magnitude more computer power is available by using 
more powerful computers. The cost of the computer itself would be on the order of $10 
million. 


Estimating supercomputer performance/price ratios is harder since their prices are not 
generally listed. Using the Roadrunner computer to calibrate (~$100 million) would give a 
current conversion factor of 1.026 petaflops / $100 million, increasing with a decade time of 
3.7 years. The range of ratios at each time also likely spans at least an order of magnitude: the 
"personal supercomputer" Microwulf's price/performance ratio (2007) was $48/Gflop (see 
footnote 20), while Sun's SPARC Enterprise M9000 mainframe (base price of $511,385) produced 
1.03 TFLOPS of measured performance, making its PPR ≈ $496/Gflop. Similarly, custom-built 
hardware can likely improve the ratio by at least 1-2 orders of magnitude (Wehner, Oliker et 
al., 2008). Using this we get an estimate of 


[MFLOPS/$] = 10.26 · 10^((t - 2008)/D) 


where D = 3.7 years and the uncertainty is at least one order of magnitude at any point in time. 
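
The arithmetic behind these figures is simple enough to check directly. The sketch below recomputes the Roadrunner calibration and the M9000 price/performance ratio from the numbers quoted above and evaluates the projection with its order-of-magnitude uncertainty band; the function names are hypothetical, and the symmetric band is only one reading of "at least one order of magnitude".

```python
def mflops_per_dollar_from(flops, price_dollars):
    """Convert a machine's FLOPS and price into MFLOPS per dollar."""
    return flops / price_dollars / 1e6

def dollars_per_gflop(price_dollars, gflops):
    """Price/performance ratio (PPR) in dollars per Gflop."""
    return price_dollars / gflops

def projected_mflops_per_dollar(year, base=10.26, base_year=2008, decade_time=3.7):
    """Supercomputer MFLOPS/$ projected from the 2008 Roadrunner calibration."""
    return base * 10 ** ((year - base_year) / decade_time)

def uncertainty_band(year, orders_of_magnitude=1.0):
    """Projection divided and multiplied by 10**orders_of_magnitude."""
    centre = projected_mflops_per_dollar(year)
    factor = 10 ** orders_of_magnitude
    return centre / factor, centre, centre * factor

if __name__ == "__main__":
    print(mflops_per_dollar_from(1.026e15, 100e6))   # Roadrunner calibration
    print(dollars_per_gflop(511385, 1030))           # M9000 at 1.03 TFLOPS
    print(uncertainty_band(2020))
```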
Figure 32: RAM memory per dollar over time. 





20 http://www.calvin.edu/~adams/research/microwulf/ 
The amount of RAM memory per dollar grows neatly exponentially, with a decade time of 4.8 
years. 


Figure 33: RAM access speed over time. 


The access times of RAM decline in the data, although numbers are only available from the 
late 1980s onwards. The curve fit produces a decade time of 18.9 years. 


Figure 34: Disc storage per dollar over time. 


A similar trend to the exponential price-performance growth of "Moore's law" in computers is 
"Kryder's law" in disc drives (Walter, 2005). The amount of storage per dollar began to grow 
faster around 1980 (a curve was fitted for 1980-2005), with a decade time of only 3.5 years 
(Figure 34). Relatively few disk access times are available, but fitting an overall exponential 
trend gives a decade time of 29.3 years (Figure 35). 


Given the growing ratio between storage capacity and speed of both RAM and drives, 
updating all of the stored data will take longer and longer: storage capacity is outrunning the 
access speed. This may force finer granularity on brain emulations so that there are more 
processors with their own memory rather than few very powerful processors being limited 
by the amount of data they can request. 
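
The widening gap can be made concrete with the decade times quoted above (capacity per dollar: 4.8 years for RAM and 3.5 years for disc; access time: 18.9 and 29.3 years). The sketch below is a simplification: it assumes a fixed budget and treats the improvement in access time as a proxy for how fast the whole store can be swept, which ignores bandwidth and parallel access. The function names are hypothetical.

```python
def growth_factor(years, decade_time):
    """Multiplicative change after `years`, given tenfold change per decade time."""
    return 10 ** (years / decade_time)

def sweep_time_growth(years, capacity_decade_time, access_decade_time):
    """Relative increase in the time needed to touch all stored data.

    Capacity grows by one factor while access gets faster by a smaller one,
    so the ratio of the two is the growth in sweep time.
    """
    return (growth_factor(years, capacity_decade_time)
            / growth_factor(years, access_decade_time))

if __name__ == "__main__":
    print("RAM, 10 years: ", sweep_time_growth(10, 4.8, 18.9))
    print("Disc, 10 years:", sweep_time_growth(10, 3.5, 29.3))
```

Under these assumptions a single processor sweeping all of its storage falls further behind every year, which is the motivation for spreading the emulation over many processors with local memory.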
Figure 35: Disc access time over time. 


Future 


The issue is not whether there exist physical systems able to perform the computations done 
by brains, since such systems already exist: the brains themselves. Rather, the issue is whether 
brain-emulating hardware can be constructed by human ingenuity in the foreseeable future at 
a sufficiently low cost to make WBE feasible. 


While Moore's law and other "exponential" progress laws in computing have held relatively 
steady over decades, they will presumably break at some point. At the very limit, the 
quantized nature of the universe is likely to limit them. Before that point, limitations on 
molecular building materials, light speed, and energy dissipation will prove increasingly 
problematic. 


Semiconductors still have a long way to go (Meindl, Chen et al., 2001). The international 
semiconductor roadmap (ITRS07, 2007) appears relatively confident within the 2016 time 
horizon, and lists the research challenges needed to continue the trend towards 2022. 
However, this is still relatively close compared to the timescales likely for achieving WBE. 
Worse, improved individual chip performance may not necessarily carry over directly into 
improved supercomputer performance. 


A study of the limits of current technology trends (DeBenedictis, 2004) found that the main 
problem in the not too distant future (beyond 10-20 years) is going to be heat dissipation, at 
least for massively parallel computers (see footnote 21). The expected performance in this 
"end game" beyond 2016 would be in the exaflops to zettaflops range (10^18-10^21). While cooling 
can be made more effective by dispersing the computation units, this introduces communications 
delays. Reversible logic may reduce heat dissipation, but the best current implementations 
still dissipate two orders of magnitude more energy than the theoretical minimum. The study 
argues that a cost effectiveness of 5-8 TFLOPS/$ and year (given running costs and pro-rated 
purchase price) is achievable. More efficient cooling or less dissipative components are 
recognized as important for zettaflops computing (DeBenedictis, Kogge et al., 2006). 


21 The exception may be architectures dominated by memory rather than processing, which can remain dense and 
cool to a greater extent. WBE applications, however, are likely to need to update most of the state variables each 
timestep, which means they are unlikely to be storage-dominated. 


Exactly what could supersede semiconductors remains conjectural but under intense research 
(Welser, Bourianoff et al., 2008; Compañó, Molenkamp et al., 1999). A few of the possibilities 
that are being pursued or considered are: 


• Y-branch switches are nanoelectronic components that use electrons moving in the 
ballistic regime that are not stopped by barriers (Forsberg and Hieke, 2002). Such 
devices could be used to build reversible logic gates, reducing energy dissipation 
(Forsberg, 2004). 

• Rapid Single Flux Quantum (RSFQ) Logic uses superconducting quantum effects to 
switch, using the flux quantum as a bit (Likharev and Semenov, 1991). It can switch 
extremely fast (hundreds of gigahertz), has a low power consumption, and existing 
chip technology can be adapted to make RSFQ circuitry (Zinoviev, 1997). Simple 
microprocessors have been demonstrated (Yoshikawa, Matsuzaki et al., 2002). 
However, RSFQ requires low temperatures for superconductivity. 

• Optical computing. Using photons rather than electrons would increase transmission 
speeds if suitable nonlinear components can be found. The field has been pursued on 
and off since the 1960s with a variety of technologies (Sawchuk and Strand, 1984; 
Higgins, 1995). All-optical gates are approaching the performance of their electronic 
counterparts but still need further development (Zhang, Wang et al., 2005). A key 
problem is that optical communication tends to require more power over short 
distances than electronics, due to the need to avoid shot noise. 

• Quantum dot cellular automata (QCA). Electron configurations in patterns of 
quantum dots act as a cellular automaton (Tougaw and Lent, 1994; Li, Wu et al., 2003; 
Robledo, Elzerman et al., 2008; Amlani, Orlov et al., 1999). Since QCA do not transmit 
currents and perform reversible computation (Timler and Lent, 2003), heat dissipation 
is low, and they may provide an ideal environment for zettaflops computing 
(DeBenedictis, 2005). While current dots are semiconductors, smaller and more 
temperature resistant systems could be constructed out of molecular quantum dots 
(Lieberman, Chellamma et al., 2002). A related technology is "spintronics", where 
magnetic polarisation is used for computing (Allwood, Xiong et al., 2005; Imre, Csaba 
et al., 2006). While also dissipating little heat, speed limitations may make it 
unsuitable for rapid processing. 

• Helical logic, reversible logic using electrons constrained to helical pathways shifted 
by an external field (Merkle and Drexler, 1996). While reversible, it requires low 
temperatures to function and has speed limitations. 

• Molecular electronics, where traditional electronics such as diodes or transistors are 
reproduced on the molecular level (Tseng and Ellenbogen, 2001). Logic circuits 
(Bachtold, Hadley et al., 2001) and non-volatile memory (Rueckes, Kim et al., 2000) 
based on carbon nanotube transistors have been demonstrated. 

• Intramolecular nanoelectronics, where switching occurs on the molecule level. This 
would in principle be able to reach the ultimate bit densities of 10? bits/cm?, 
switching cycles of 10 ps and 1 eV energy per bit cycle (Compañó, Molenkamp et al., 
1999). It would either require high-precision self-assembly or assembly through 
molecular nanotechnology. 

• Rod logic, computing using nanoscale rigid "rods" interacting mechanically, was 
originally suggested mostly as a proof-of-concept that computing could be done on 
the nanoscale (Drexler, 1992). Variants of rod logic can implement reversible 
computation (Merkle, 1993), making it suitable for very high density processing. It 
would also require molecular nanotechnology for construction. 


Pessimistic claims are often made to the effect that limitations of current technology are 
forever unsurpassable, or that theoretically possible technological systems such as the above 
will be too costly and practically difficult to ever become feasible. Still, given that computer 
technology has developed in a relatively stable manner despite several changes of basic 
principles (e.g. from flip-flops via core memory to several semiconductor generations, from 
vacuum tubes to transistors to integrated circuits etc.), there is no strong reason to assume 
that the growth trends will break merely because current technology will eventually be replaced 
by other technologies. Large vested interests in continuing the growth are willing to spend 
considerable resources on closing technology gaps. A more likely end-of-growth scenario is that 
the feedback producing exponential growth is weakened by changes in the marketplace such as 
lowered demand, lowered expectations, rising production facility costs, long development times or 
perhaps more efficient software (see footnote 22). 


If further miniaturization proves hard to achieve for one reason or another, increased 
performance can still be achieved by using parallelism. If many units are sold, prices will go 
down even if the processors are not more powerful, and a slower exponential growth of MIPS 
per dollar will continue. It should be noted that, since the commodity market has driven 
development of computers very strongly, and at present only about 15% of humanity owns a 
computer, there are still large untapped markets that will over time become rich enough to 
afford computers and hence drive further technological development. As demonstrated by 
various ingenious uses of GPU chip programming for high performance scientific 
applications, it is entirely possible to profit from a completely unrelated consumer demand if 
the software is adapted to high-performance hardware, even when this hardware was 
developed for a completely different purpose (in this case computer games). The main 
concern for WBE is that it is entirely possible for consumer computing to go in a direction less 
suitable for emulation software (e.g. trading processing power for lower power usage). 


22 As an example, the exponential growth of computing power hides software bloat, where new software tends to 
require increasingly more resources to do the same tasks as old versions. This is partially due to trading program 
efficiency for programming simplicity. The expectation of exponentially growing hardware capabilities makes it 
worthwhile to get new hardware often, which together with the bloat feeds the cycle. If hardware development were 
to slow, incentives to avoid bloat would emerge, and even for non-bloating software applications the benefits of 
buying new hardware often would lessen. This would reduce the strength of the feedback loop (and possibly remove 
it), severely reducing the increase in computing capabilities. 
Appendix C: Large-scale neural network simulations 


These simulations represent a small sampling of the literature. There is a bias towards 
research aimed at high performance computational neuroscience and developing new 
methods. 


Table 11: Large-scale neural simulations 


Simulation 


(Plesser, Eppler et 
al., 2007) 


(Djurfeldt, 
Lundqvist et al., 
2006) 


(Izhikevich, 2005) 


(Howell, 
Dyhrfjeld-Johnsen 
et al., 2000) 


(Howell, 
Dyhrfjeld-Johnsen 
et al., 2000) 


(Kozlov, Lansner 
et al., 2007) 

(Frye, 
Ananthanarayanan 
et al., 2007) 


(Brette, Rudolph et 
al., 2007; Traub, 
Contreras et al., 
2005) 

(Traub, Contreras 
et al., 2005) 


Type of 
simulation 


Integrate-and- 
fire 


6-compartment 
neurons 


Izhikevich 
neurons, 
Random 
synaptic 
connectivity 
generated on the 
fly. 
Compartment 
(Purkinje 4,500 
comp./granule 1 
comp.) 


5 compartment 
model 

Low complexity 
spiking like 
Izhikevich 
neurons, STDP 
synapses 

14 cell types, 
conductance and 
compartment 
model 

14 cell types, 
conductance and 
compartment 
model, 11 active 
conductances 


Neurons 


12,500 


2.2307 


10u 


16 Purkinje, 
244,000 
granule cells 


60,000 
granule cells, 
300 mossy 
fibers, 300 
Golgi cells, 
300 stellate 
cells, 1 
Purkinje cell 
900 


8:106 


3,560 


3,560 


Synapses 


1.56107 


1.11010 


1015 


10% per 
hemisegment 
6300*8-106— 
5.04-1010 


3,500 gap 
junctions, 
1,122,520 
synapses 
3,500 gap 
junctions, 
1,122,520 
synapses 


Hardware 
and 
Software 


Sun X4100 
cluster, 
2.4Ghz AMD 
Opteron, 
8GB ram, 
MPI, NEST 
IBM Watson 
Research 
Blue Gene, 
SPLIT 


Beowulf 
cluster, 27 
3GHz 
processors 


128- 
processor 
Cray T3E, 
PGENESIS 
Single 
workstation 
1GB 
memory, 
GENESIS 


SPLIT 


4096 
processor 
BlueGene/L 
256 Mb per 
CPU 

Cray XT3 2.4 
Ghz, 800 
CPUs 


14 cpu Linux 
cluster, an 
IBM e1350, 
dual- 
processor 
Intel P4 Xeon 


Slowdown 
(time 
required for 
1 biological 
second) 

2 


5,942 


4.2:106 


4,500 


79,200 


10 


200? 


Timestep 


0.1 ms 


20 us 


100 us 


1ms 


Event 
driven 


2ys 


Notes 


Supralinear 
scaling until 80 
virtual 
processes for 
105 neuron/10? 
synapses. 
Computing 
requirements: 
11.5 TFLOPS. 
Spatial delays 
corresponding 
16 cm? cortex 
surface. 


Computing 
requirements: 
76.8 GFLOPS 


5954-8516 
equations/CPU 
Gives # 
compartments 
(Goodman, 
Courtenay Wilson 
et al., 2001) 


(Frye, 2004) 


(Aberdeen, Baxter 
et al., 2000) 


(Kondo, Koshiba et 
al., 1996) 


(Ccortex, 2003) 


(Harris, Baurick et 
al., 2002) 


(Mehrtash, Jung et 
al., 2003) 


(Shaojuan and 
Hammarstrom, 
2002) 


(Johansson and 
Lansner, 2007) 


3 layer cortical 
column, 2596 
GABAergic, 
conductance 
models, synapse 
reversal pot., 
conductance, 
abs. str., mean 
prob. release, 
time const. 
recover depress 
facilitation, 
Four cell types. 
Km, Ka, Kanp 
channels 


ANN 
feedforward 
network trained 
with iterative 
gradient descent 
Hopfield 
network 


Layered 
distribution of 
neural nets, 
detailed synaptic 
interconnections, 
spiking 
dynamics 

Point neurons, 
spikes, AHP, A 
& M potassium 
channels, 
synaptic 
adaptation, 
Spike shape and 
postsynaptic 
conductance 
specified by 
templates. 
Integrate and 
fire, event 
driven pulse 
connections with 
STDP plasticity. 
1-8 dendritic 
segments. 

Palm binary 
network 


BCPNN spiking, 
minicolumns 
hypercolumns 


5,150 per 
node, max 
20*5,150=103, 
000 


1,000/cells 
column, 
2,501 
columns = 
2.5106 


4,083 


1,536 


2-1010 


35,000 cells 
per node 


256,000 


256,000 


1.6:106 


In columns 
192,585 per 
node, max 20 
= 3.85106, 
intercolumn 
1.4-106 


250K per 
column, total 
6.763:105 


1.73:106 


2.4-106 


241055 


6.1:106 
synapses per 
node 
Order of 
3.21010 


2.01011 


2.4-GHz, 512 
Kbytes of L2 
cache; or, 
e1350 Blade 
1,at 2.8 GHz 
NCS, 30 
dual- 
Pentium III 
1GHz 
processor 
nodes, with 4 
GB of RAM 
per node 


BlueGene, 
1024 CPUs 


Cluster 196 
pentium III 
processors, 
550 MHz, 
384Mb 
Special 
purpose 
chips 

500 nodes, 
1,000 
processors, 1 
Tb RAM. 
Theoretical 
peak of 4,800 
Gflops. 
Cluster, 128 
Xeon 2.2 
GHz 
processors 
with 256GB, 
1Tb disk 


Special 
purpose 
chip, 
connected 
UltraSparc 
processor 500 
Mhz, 640 Mb 
512 
processors, 
192 GB, 250 
Mhz mips 
12K 
processor 


256 node, 512 
processors, 
Dell Xeon 
cluster, 


165,000 

3,000? 2.338-1010 
spikes per 
processor. 
Memory 
requirement 
110 Mb per 
processor 
Claimed 
performance, 
not 
documented. 

1012.21 

timessteps/ 

ms 

Execution 

time 4 min 7 

sec, -13% MPI 

calls for 

retrieving 

180K training 

vectors, 

~O(10) 

iterations 

9% real-time Weight 

=11.1 updates 47 ms, 
update 


activities 59 ms 
(Markram, 2006) 


(Traub and Wong, 
1982) 

(Traub, Miles et al., 
1992) 


(Moll and 
Miikkulainen, 
1997) 


Morphologically 
complex 
compartment 
models 


19 compartment 
cells, ionic 
currents 

Binary units 
with binary 
weights 


10,000 108 

100 

1,200 70,000 
79,500 7.82:108 


3.4GHz 8 Gb 
memory, 
peak 
performance 
13.6 Gflop 
BlueGene/L, 
4096 nodes, 
8192 cpus, 
peak 
performances 
22.4 TFLOPS 


IBM3090 
computer 


Cray Y-MP 
8/864 


325 min/1.5 s 
bio time = 
13,000 

0.006 s - 
(retrieval of 
550,000 

patterns; 
assuming 

human 

retrieval 

speed - 1 s) 


0.05 ms 


120 kW usage. 
Computational 
demand 
256*13.6 Gflops 


Not 
documented? 


Plotting the size of the simulations over time does suggest a rapidly increasing scale, 
although there is not enough data to estimate a reliable trend. There does not seem to exist a 
clear separation between biologically detailed models and highly simplified models; some of 
the largest simulations have been detailed compartment models. 


Figure 36: Number of neurons in different large computational neuroscience models. 
Figure 37: Number of synapses in different computational neuroscience models. 


Appendix D: History and previous work 


The earliest origins of the mind emulation idea can perhaps be traced back to J.D. Bernal's The 
World, The Flesh, The Devil (1929), where he wrote: 


"Men will not be content to manufacture life: they will want to improve on it. For one 
material out of which nature has been forced to make life, man will have a thousand; 
living and organized material will be as much at the call of the mechanized or 
compound man as metals are to-day, and gradually this living material will come to 
substitute more and more for such inferior functions of the brain as memory, reflex 
actions, etc., in the compound man himself; for bodies at this time would be left far 
behind. The brain itself would become more and more separated into different 
groups of cells or individual cells with complicated connections, and probably 
occupying considerable space. This would mean loss of motility which would not be 
a disadvantage owing to the extension of the sense faculties. Every part would not be 
accessible for replacing or repairing and this would in itself ensure a practical eternity 
of existence, for even the replacement of a previously organic brain-cell by a synthetic 
apparatus would not destroy the continuity of consciousness. 


Finally, consciousness itself may end or vanish in a humanity that has become 
completely etherealized, losing the close-knit organism, becoming masses of atoms in 
space communicating by radiation, and ultimately perhaps resolving itself entirely 
into light." 


Bernal's vision corresponds to a gradual replacement of biology with artificial parts, 
gradually making it unnecessary to keep the brain in one location. 


In the science fiction novel City and the Stars (1956) Arthur C. Clarke described a far future 
city where bodies are manufactured by the central computer, minds stored in its databanks 
downloaded into them, and when an inhabitant dies their mind is stored yet again in the 
computer, allowing countless reincarnations. Other early science fiction treatments were 
Roger Zelanzky's Lord of Light (1968), Bertil Mártensson's Detta ür verkligheten (1968) and 
Rudy Rucker's Software (1979). Since then, mind emulation ("uploading") has become a staple 
of much science fiction?. Of particular note in terms of technological and philosophical 
details are the novels and short stories by Greg Egan (Permutation City, Diaspora, Learning to be 
Me, Transition Dreams etc). 


Brain (and mind) emulation has also been widely discussed in philosophy of mind, although 
more as Gedankenexperimente than possible actual practice (e.g. (Parfit, 1984; Chalmers, 1995; 
Searle, 1980)). 


The first attempt at a careful analysis of brain emulation was a technical report (Merkle, 
1989b) predicting that "a complete analysis of the cellular connectivity of a structure as large 
as the human brain is only a few decades away". The report reviewed automated analysis 
and reconstruction methods, going into great detail on the requirements needed for parallel 
processing of brain samples using electron microscopes and image analysis software. It also 
clearly listed assumptions and requirements, a good example of falsifiable design. 


5 E.g. http://en.wikipedia.org/wiki/Mind_transfer_in_fiction 


The first popularization of a technical description of a possible mind emulation scenario was 
found in Hans Moravec's Mind Children (1990), where the author describes the gradual 
neuron-by-neuron replacement of a (conscious) brain with software. Other forms of 
emulation are also discussed. 


(Hanson, 1994) was the first look at the economic impact of copyable minds, showing that 
social role-fit brain emulation would likely cause dramatic economic and demographic 
changes. 


One sketch of a person emulation scenario (Leitl, 1995) starts out with the cryonic suspension of 
the brain, which is then divided into cubic blocks < 1 mm. The blocks can individually be 
thawed for immunostaining or other contrast enhancement. For scanning, various methods 
are proposed: X-ray Fresnel/holographic diffraction, X-ray or neutron beam tomography (all 
risking radiation damage, and possibly requiring strong staining), transmission EM (requires very thin 
samples), UV-abrasion of immunostained tissue with mass spectrometry, or abrasive atomic 
force microscope scanning. While detailed in terms of the cryosuspension methods, the sketch 
becomes less detailed in terms of the actual scanning method and the implementation of the emulation. 


Appendix E: Non-destructive and gradual replacement 


Non-Destructive Scanning 


Non-destructive scanning requires minimally invasive methods. The scanning needs to 
acquire the relevant information at the necessary 3D resolution. There are however several 
serious limitations: 


• The movement of biological tissue, requiring either imaging faster than it can move 
or accurate tracking. In cats, the arterial pulse produces 110-266 µm movements 
lasting 330-400 ms and breathing produces larger (300-950 µm) movements (Britt and Rossi, 
1982)²⁴. The stability time is as short as 5-20 ms. 

• Imaging has to occur over a distance of ≥150 mm (the width of an intact brain) or be 
invasive. 

• The imaging must not deposit enough energy (or use dyes, tracers or contrast 
enhancers) to hurt the organism. 

• The method must not significantly alter the mental or neural state of the subject being 
scanned, in order to avoid a possibly significant "observer effect": anomalies and false 
readings that could produce a flawed emulation model. 


Of the possible non-invasive candidates only MRI appears able to satisfy these constraints even in 
principle. Optical imaging, even using first-arriving light methods, would not work across such 
a large distance. X-ray tomography of the intensity needed to image tissue would deposit 
harmful energy (see discussion on X-ray microscopy). 


The resolution of MRI depends on the number of phase steps used, gradient strength, 
acquisition time and desired signal-to-noise ratio. To record micron-scale features in a 
moving brain, very short acquisition times are needed, or a way of removing the movement 
artefacts. Each doubling of spatial resolution divides the signal-to-noise ratio by 8, requiring 
longer acquisition times, stronger fields or more sensitive detectors. Finally, there are also 
problems with tissue water self-diffusion, making resolutions finer than 7.7 µm impossible 
to achieve (Glover and Mansfield, 2002)²⁵. Given that brain emulation on the synaptic level 
requires higher resolution, this probably rules out MRI as a non-destructive scanning method. 
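

The scaling described in the paragraph above can be made concrete with a short sketch. It assumes isotropic voxels, so that the signal is proportional to voxel volume, and it further assumes (this is not stated in the text) that the signal-to-noise ratio regained by averaging grows with the square root of acquisition time. The function names are illustrative only.

```python
def relative_snr(resolution_gain: float) -> float:
    """SNR relative to baseline when linear resolution improves by
    `resolution_gain` along all three axes (signal ~ voxel volume)."""
    return 1.0 / resolution_gain ** 3


def averaging_time_factor(resolution_gain: float) -> float:
    """Acquisition-time multiplier needed to restore baseline SNR by
    averaging alone, under the assumption SNR ~ sqrt(acquisition time)."""
    return (1.0 / relative_snr(resolution_gain)) ** 2


if __name__ == "__main__":
    # Each doubling of resolution divides SNR by 8, as stated in the text.
    for gain in (2, 4, 8):
        print(f"{gain}x finer: SNR x{relative_snr(gain):.4g}, "
              f"averaging time x{averaging_time_factor(gain):.4g}")
```

Under these assumptions a single doubling of resolution would cost a 64-fold longer scan if nothing else changed, which is one way to see why stronger fields or more sensitive detectors are listed as alternatives.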


However, if the brain is frozen, water diffusion and movement do not occur and very long 
acquisition times can be used. One problem with MRI of frozen brain tissue is that limited 
proton mobility reduces the signal; this can be ameliorated by keeping the brain at -19 °C 
(Longson, Hutchinson et al., 1995; Doyle, Longson et al., 1996). MRI might therefore be a 
possible scanning method for frozen or fixed brains. Since it is not destructive to the tissue it 
may also act as an adjunct to other, destructive, scanning methods. 


24 The authors of the paper suggest, as a solution, using a cardiac bypass system to create a nonpulsatile 
flow of oxygenated blood. 

25 Higher resolutions have been reported based on using micro-coil receivers and very strong magnetic field 
gradients. (Ciobanu, Seeber et al., 2002) report a resolution of 3.7 × 3.3 × 3.3 µm, but the imaging time was 30 h and 
the authors were pessimistic about the ability to detect different chemicals. 


A possible way around the problems of non-invasive scanning would be to use endoscopic 
methods to bring measurement devices into the blood vessels of the brain and image it from 
inside. Endoscopic MRI has been demonstrated in the gastrointestinal system, where an RF coil 
was affixed to the tip of an endoscope (Inui, Nakazawa et al., 1998). This is limited by the 
range of the devices, their size and the risks of penetrating vessels. However, it appears 
unlikely that all parts of the brain are close enough to a large blood vessel to allow accurate 
scanning, even leaving out the resolution and movement issues. 


Correlation mapping using nanoprobes (Strout, 2006) has also been suggested. A large 
number (10¹²) of nanomachines set up residence in or on neurons, recording their activity and 
functional correlations. A more detailed analysis of how nanomachines could sense neural 
activity can be found in (Freitas Jr, 1999, section 4.8.6) together with a sketch of an in vivo 
fibre network with 10¹⁸ bit/s capacity (Freitas Jr, 1999, section 7.3.1). This would enable 
essentially real-time monitoring of all neurons and their chemical environment (Kurzweil, 
2005, pp. 548-549). Whether just the activity would be enough to map the connectivity and 
memory states of the brain is unclear. However, given the assumed technology both 
intracellular probing and physical connectivity mapping by histonating nanorobots appear 
possible. Whether this would interfere with brain function is hard to estimate (but see (Freitas 
Jr., 2003) for an analysis). The fibre network would have a volume of 30 cm³, which is 24-76% of 
the normal blood volume in the brain (based on estimates of blood volume in (Leenders, 
Perani et al., 1990)), possibly impeding blood flow if extended inside vessels. Although the 
capabilities of nanomachines can be constrained by known physics it is not possible today to 
infer enough about machine/cell/tissue interactions to estimate whether non-destructive 
scanning is feasible. 


Overall, the prospects for non-destructive scanning do not look good at present. At the very 
least it would need to involve very invasive endoscopic scanning or the use of advanced 
nanomedicine. 


Gradual replacement 


Scanning might also occur in the form of gradual replacement, as piece after piece of the brain 
is replaced by an artificial neural system interfacing with the brain and maintaining the same 
functional interactions as the lost pieces. Eventually only the artificial system remains, and 
the information stored can be moved if desired (Moravec, 1988). While gradual replacement 
might assuage fears of loss of consciousness and identity²⁶, it appears technically very 
complex as the scanning system not only has to scan a living, changing organism but also 
interface seamlessly with it (at least on the submicron scale) while working. The technology 
needed to achieve it could definitely be used for scanning by disassembly. Gradual 
replacement is therefore not likely as a first form of brain emulation scanning (though in 
practice it may eventually become the preferred method if non-destructive scanning is not 
possible). 


It is sometimes suggested that extending the brain through interfaces with external software 
might achieve a form of transfer where more and more of the entire person is stored outside 
the brain, possibly reaching the point where the brain is no longer essential for the composite 
person. However, this would not be brain emulation per se but rather a transition to a 
posthuman state. The technical feasibility appears ill-defined given current knowledge. It 
should be noted that even relatively partial such interfaces or life recording (Gemmell, Bell et 
al., 2006) would produce a wealth of useful data for developing brain emulations by acting as 
a yardstick of normal function²⁷. 


26 But not necessarily. Searle has argued that replacement would gradually remove conscious experience (Searle, 
1980). Parfit's ‘physical spectrum’ thought experiment involves interpolating between two different people that 
clearly have different identities, and the replacement process could have a similar property (Parfit, 1984). 

27 Again, it is sometimes suggested that recording enough of the sensory experiences and actions would be enough to 
produce brain emulation. This is unlikely to work simply because of the discrepancy in the number of degrees of 
freedom between the brain (at least 10¹⁴ synaptic strengths distributed in a 10²² element connectivity matrix) and the 
number of bits recorded across a lifetime (less than 2·10¹⁴ bits (Gemmell, Bell et al., 2006)). 
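

The comparison in the footnote above can be illustrated with a rough count. The sketch below assumes about 10¹¹ neurons and 10¹⁴ synapses (giving a 10²² element connectivity matrix), a single bit per synaptic strength as a deliberately generous lower bound, and 2·10¹⁴ bits as the lifetime recording bound. To count the bits needed merely to say which matrix entries hold synapses, it uses the standard approximation log2 C(N, K) ≈ K·log2(e·N/K) for K much smaller than N. All names and the one-bit figure are illustrative assumptions, not values taken from the cited sources.

```python
import math

NEURONS = 1e11                  # assumed neuron count
SYNAPSES = 1e14                 # assumed lower bound on synaptic strengths
MATRIX_ELEMENTS = NEURONS ** 2  # possible neuron-to-neuron entries (1e22)
BITS_PER_STRENGTH = 1           # hypothetical, deliberately minimal
LIFETIME_BITS = 2e14            # assumed upper bound on recorded bits


def placement_bits(elements: float, occupied: float) -> float:
    """Approximate log2 of the number of ways to choose `occupied`
    entries out of `elements` (valid when occupied << elements)."""
    return occupied * math.log2(math.e * elements / occupied)


def required_bits() -> float:
    """Bits to specify where the synapses are plus their strengths."""
    where = placement_bits(MATRIX_ELEMENTS, SYNAPSES)
    strengths = SYNAPSES * BITS_PER_STRENGTH
    return where + strengths


if __name__ == "__main__":
    ratio = required_bits() / LIFETIME_BITS
    print(f"required / recorded = {ratio:.1f}")
```

Even with one bit per synapse, this count exceeds the assumed recording bound by more than an order of magnitude, and more realistic precision per synaptic strength would only widen the gap.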


Appendix F: Glossary 


AFM                   Atomic force microscope (sometimes called scanning force 
                      microscope). 

ANN                   Artificial Neural Network, a mathematical model based on 
                      biological neural networks. 

ATLUM                 Automatic Tape-Collecting Lathe Ultramicrotome. 

Autoregulation        Regulation of blood flow to maintain the (cerebral) environment. 

Axon                  Projection from a nerve cell that conducts signals away from the 
                      neuron's cell body. 

Blockface             The surface of an imaged sample, especially when cutting takes 
                      place. 

Bouton                The typical synaptic bump that grows between axons and 
                      dendrites. 

CNS                   Central Nervous System. 

Confocal microscopy   Optical imaging technique to image 3D samples by making use 
                      of a spatial pinhole. 

Connectome            The total set of connections between regions or neurons in a 
                      brain (Sporns, Tononi et al., 2005). 

Dendrite              Branched projections of neurons that conduct signals from other 
                      neurons to the cell body. 

Exaflop               10¹⁸ FLOPS. 

Extracellular         The environment outside the cells. 

FIBSEM                Focused Ion Beam SEM. 

FLOPS                 Floating point Operations Per Second. A measure of computing 
                      speed. 

Fluorophore           A molecule or part of a molecule that causes fluorescence when 
                      irradiated by UV light. Used for staining microscope 
                      preparations. 

FPGA                  Field-Programmable Gate Array. Semiconductor device that 
                      contains programmable logic and interconnects, allowing the 
                      system designer to set up the chip for different purposes. 

GABAergic             Related to the transmission or reception of GABA, the chief 
                      inhibitory neurotransmitter. 

GFLOP                 Gigaflop, a billion FLOPS. 

Glia                  Glial cells are non-neuronal cells that provide support, nutrition 
                      and many other functions in the nervous system. 

Hypercolumn           A group of cortical minicolumns organised into a module with a 
                      full set of values for a given set of receptive field parameters. 

Interneuron           In the CNS, a small locally projecting neuron (unlike neurons 
                      that project to long-range targets). 

Kinase                An enzyme that phosphorylates other molecules. 

Ligand                A molecule that bonds to another molecule, such as a receptor or 
                      enzyme. 

Metabolome            The complete set of small-molecule metabolites that can be found 
                      in an organism. 

MFLOPS                Millions of Floating point Operations Per Second. A measure of 
                      computing speed. 

Microtubule           A component of the cell skeleton, composed of smaller subunits. 

Minicolumn            A vertical column through the cerebral cortex; a physiological 
                      minicolumn is a collection of about 100 interconnected neurons, 
                      while a functional minicolumn consists of all neurons that share 
                      the same receptive field. 

MIPS                  Millions of Instructions Per Second. A measure of computing 
                      speed. 

Motorneuron           A neuron involved in generating muscle movement. 

MRI                   Magnetic Resonance Imaging. 

Neocortex             The cerebral cortex, covering the cerebral hemispheres. 
                      The term "neocortex" distinguishes it from related but somewhat 
                      more "primitive" cortex that has fewer than six layers. 

Neurite               A projection from the cell body of a neuron, in particular from a 
                      developing neuron where it may become an axon or dendrite. 

Neurogenesis          The process by which neurons are created from progenitor cells. 

Neuromodulator        A substance that affects the signalling behaviour of a neuron. 

Neuromorphic          Technology aiming at mimicking neurobiological architectures. 

Neuron                A nerve cell. 

Neuropeptide          A neurotransmitter that consists of a peptide (amino acid chain). 

Neurotransmitter      A chemical that relays, amplifies or modulates signals from 
                      neurons to a target cell (such as another neuron). 

Parallelisation       The use of multiple processors to perform large computations 
                      faster. 

PCR                   Polymerase Chain Reaction, a technique for amplifying DNA 
                      from a small sample. 

Petaflop              10¹⁵ FLOPS. 

Phosphorylation       Addition of a phosphate group to a protein molecule (or other 
                      molecule), usually done by a kinase. This can activate or 
                      deactivate the molecule and plays an important role in internal 
                      cell signalling. 

PNS                   Peripheral Nervous System. 

Potentiation          The increase in synaptic response strength seen after repeated 
                      stimulation. 

SBFSEM                Serial Block-Face Scanning Electron Microscopy. 

SEM                   Scanning Electron Microscopy. 

Sigmoidal             S-shaped, usually denoting a mathematical function that is 
                      monotonically increasing, has two horizontal asymptotes and 
                      exactly one inflection point. 

Skeletonization       Image processing method where a shape is reduced to the set of 
                      points equidistant from its boundaries, representing its 
                      topological structure. 

Soma                  The cell body of a neuron. 

Spectromicroscopy     Methods of making spectrographic measures of chemical 
                      composition in microscopy. 

SSET                  Serial Section Electron Tomography. 

SSTEM                 Serial Section Transmission Electron Microscopy. 

Supervenience         A set of properties A supervenes on a set of properties B if and 
                      only if any two objects x and y that share all their B properties 
                      must also share all their A properties. Being B-indiscernible 
                      implies being A-indiscernible. 

Synapse               A junction through which one (or more) neurons signal to 
                      another cell. 

Synaptic spine        Many synapses have their boutons offset from their parent 
                      dendrite through a thinner filament. 

TEM                   Transmission Electron Microscopy. 

TFLOPS                Teraflops, 10¹² FLOPS. 

Tortuosity            A measure of how many turns a surface or curve makes. 

V1                    The primary visual cortex. 

Voxel                 A volume element, representing a value on a regular grid in 3D 
                      space. 